AN EXAMINATION OF THE STATISTICAL PROPERTIES OF A MULTIVARIATE
MEASURE OF STRENGTH OF RELATIONSHIP



1. INTRODUCTION

It is well known that statistical significance, by itself, offers
no guarantee that the difference, relationship, or effect found in an
experiment is of practical significance. If the sample size is extremely
large (as it is in most educational research), virtually any minuscule
observed effect will be judged "significant" at conventional alpha
levels.

In recognition of this fact, the practice is becoming more and
more widespread of computing such a statistic as Hays' (1963) $\hat{\omega}^2$ after
finding a significant difference by a t-test or a significant effect in
an analysis of variance. This statistic offers an estimate of the
proportion of variability of the dependent variable that can be attributed
to the relevant independent variable, in the population. As such, it
is an extremely important and valuable adjunct to significance tests.
This is so especially in educational research, where the practical
implementation of a research finding is almost always an expensive
affair; one therefore wants to have reasonable assurance that the
improvement due to an innovation will be of such a magnitude as to warrant
the cost, before embarking on a drastic, large-scale change in
educational practices.

Another feature of educational research is that it usually involves
(or should involve) a multiplicity of dependent variables. Appropriate
significance testing procedures for multivariate experimental designs
have long been known, and are increasingly coming to be used by
educational researchers. On the other hand, measures of strength of
relationship (as against significance of relationship) in multivariate
designs have been virtually unexplored. This is a great drawback for
educational research, in view of the crucial importance of having such
a measure for reasons mentioned above. As far as could be ascertained,
the only such measures advocated to date are those proposed by I. L.
Smith (1971) and Tatsuoka (1970). (The fact that both of these were
proposed in the context of educational and psychological research
attests the importance of a measure of strength of relationship for
multivariate analysis in educational research.) The first of these,
being based on step-down procedures, "does not yield [results] that are
invariant under alternate orderings" (Smith, 1971) of the dependent
variables. This seems to be an undesirable property. The second,
$\hat{\omega}^2_{\mathrm{mult}}$, is a direct extension of Hays' $\hat{\omega}^2$ to multivariate analysis of
variance. It was tentatively proposed on an intuitive basis for one-way
MANOVA by Tatsuoka (1970) and, independently, by Sachdeva (1972).

The objective of this study was to examine in detail the statistical
properties of $\hat{\omega}^2_{\mathrm{mult}}$ with a view to supplying a theoretical
justification for it or, if it turned out to be theoretically unsound, to
develop an alternative statistic that is sound. The statistic in
question was defined (Tatsuoka, 1970, p. 48) as

    $\hat{\omega}^2_{\mathrm{mult}} = 1 - \dfrac{N\,|W|}{(N - K)\,|T| + |W|}$ ,

where T and W are the total and within-groups SSCP matrices, N is the
total sample size and K is the number of treatment groups.

It is recognized, of course, that for the particular sample at
hand, the quantity $1 - \Lambda$, to which $\hat{\omega}^2_{\mathrm{mult}}$ tends as N tends to infinity
(where $\Lambda$ is Wilks' likelihood-ratio criterion), is a natural
multivariate analogue of the correlation ratio, eta-squared, and hence would be
a measure of the proportion of generalized variance of the dependent
variable vector that is accounted for by the independent variable(s).
But just as Hays found it desirable to define his $\hat{\omega}^2$ as a "nearly
unbiased" estimate of what the corresponding proportion might be in the
population, it is desirable to have a "nearly unbiased" statistic in the
multivariate case. This was the motivation for this study.

2. HEURISTIC FORMULATION OF ESTIMATED $\omega^2$

Given K p-variate normal populations with common covariance
matrix $\Sigma$ and possibly different centroids $\mu_k$ (k = 1, 2, ..., K), the
proportion of generalized variance of the p variates attributable to
differences among centroids may be defined as

(2.1)    $\omega^2_{\mathrm{mult}} = 1 - \dfrac{|\Sigma|}{|\Sigma + (\alpha\alpha')/K|}$ ,

where $\alpha = (\alpha_{jk})$ [j = 1, 2, ..., p; k = 1, 2, ..., K], and $\alpha_{jk}$
is the deviation of the k-th population mean from the general mean for
the j-th variate (or the effect parameter of the k-th group on the j-th
variate).

It is desired to get an estimate, as nearly unbiased as possible,
of $\omega^2_{\mathrm{mult}}$ from K independent random samples of n observations each
from the K populations. Purely from analogy with Hays' (1963) estimated
$\omega^2$ in the univariate case, namely

    $\hat{\omega}^2 = \dfrac{SS_b - (K - 1)\,MS_w}{SS_t + MS_w}$ ,

Tatsuoka (1970) proposed the following quantity as a possible such
estimate:

(2.2)    $\hat{\omega}^2_{\mathrm{mult}} = 1 - \dfrac{N\,|W|}{(N - K)\,|T| + |W|}$ ,

where W and T are within-groups and total sums-of-squares-and-cross-
products (SSCP) matrices, and N (= Kn) is the total sample size.

Here we present a more rational justification for this statistic,
and consider other possible formulations which (as it turns out) are
rejected as being less suitable than expression (2.2). For notational
convenience, the subscript "mult" will henceforth be omitted from both
$\omega^2_{\mathrm{mult}}$ and $\hat{\omega}^2_{\mathrm{mult}}$.

Let us denote, in addition to the two SSCP matrices cited above,
the between-groups SSCP matrix by B (= T - W). It can then be shown, in
exact parallel to the univariate case discussed by Hays, that

    $E(B) = (K - 1)\Sigma + n(\alpha\alpha')$

and

    $E(W) = (N - K)\Sigma$ .

From these relations it follows that

(2.3)    $E[W/(N - K)] = \Sigma$

and

(2.4)    $E\{[T + W/(N - K)]/N\} = \Sigma + (\alpha\alpha')/K$ .

Since the matrices on the right-hand side of equations (2.3) and (2.4)
are those whose determinants appear in expression (2.1), it seems
reasonable to use some determinantal functions of the matrices under the
expected-value operators on the left to replace these determinants in
getting an estimate of $\omega^2$. Just what determinantal functions are
appropriate, however, is not immediately obvious, because $E(P) = \Theta$ does not
imply $E(|P|) = |\Theta|$.

Several alternative ways of "slicing the matrix pie" to construct
determinants yield the following candidates as estimators of $\omega^2$:

(2.5)    $1 - \dfrac{|N W/(N - K)|}{|T + W/(N - K)|}$

(2.6)    $\dfrac{|B - (K - 1)W/(N - K)|}{|T + W/(N - K)|}$

(2.7)    $\dfrac{|T| - |W| - (K - 1)|W|/(N - K)^p}{|T| + |W|/(N - K)^p}$

(2.8)    $\dfrac{|T| - (N - 1)|W|/(N - K)}{|T| + |W|/(N - K)}$

These four expressions were examined for their convergence
properties as $N \to \infty$. A necessary condition for an admissible estimator is
that it converge to $1 - \Lambda$, where

    $\Lambda = |W|/|T|$

is Wilks' (1932) likelihood-ratio criterion. It can be shown that $1 - \Lambda$
itself converges to $\omega^2$ as $N \to \infty$. Also considered was the behavior of
each expression as p increases. The results of these investigations
were as follows:

Expression (2.5) converges to $1 - \Lambda$ as $N \to \infty$. However, for fixed
N, it can become negative for large values of p. This expression is
therefore disqualified.

Expression (2.6) converges to $|I - T^{-1}W|$ instead of $|I| - |T^{-1}W| =
1 - \Lambda$; therefore it is disqualified.

Expression (2.7) converges to $1 - \Lambda$ as $N \to \infty$, but it also so
converges as $p \to \infty$ for fixed N, which is ridiculous.

Expression (2.8) converges to $1 - \Lambda$ as $N \to \infty$, and exhibits no
"ridiculous" behavior as p increases.

It therefore appears that expression (2.8) is the most plausible
candidate, among those listed above, as an estimator of $\omega^2$. This
expression is equivalent to that originally proposed by Tatsuoka (1970) and
given in equation (2.2) above. It was also proposed, independently, by
Sachdeva (1972) in a slightly different form. This statistic, denoted
$\hat{\omega}^2$, is the only one examined in detail hereunder. It should be noted
that, besides expressions (2.2) and (2.8), two other equivalent
expressions for $\hat{\omega}^2$ are:

(2.9)    $\hat{\omega}^2 = 1 - \dfrac{N\Lambda}{(N - K) + \Lambda}$

(2.10)   $\hat{\omega}^2 = 1 - \dfrac{N}{(N - K)\prod_i \lambda_i + 1}$ ,

where the $\lambda_i$ are the eigenvalues of $W^{-1}T$ (p - K + 1 of which will be
unity when p > K - 1). Expression (2.9) most directly shows that
$\hat{\omega}^2 \to 1 - \Lambda$ as $N \to \infty$, while (2.10) is probably the most convenient for
computational purposes.
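
The equivalence of the determinant, Wilks' lambda, and eigenvalue forms of the estimator can be checked numerically. The following is a minimal sketch, assuming NumPy; the function names and the toy three-group data set are illustrative and not part of the original study.

```python
import numpy as np

def omega_sq_det(W, T, N, K):
    """Determinant form: 1 - N|W| / ((N-K)|T| + |W|)."""
    det_w, det_t = np.linalg.det(W), np.linalg.det(T)
    return 1.0 - N * det_w / ((N - K) * det_t + det_w)

def omega_sq_wilks(W, T, N, K):
    """Wilks form: 1 - N*Lambda / ((N-K) + Lambda), Lambda = |W|/|T|."""
    lam = np.linalg.det(W) / np.linalg.det(T)
    return 1.0 - N * lam / ((N - K) + lam)

def omega_sq_eigen(W, T, N, K):
    """Eigenvalue form: 1 - N / ((N-K) * prod(eig(W^-1 T)) + 1)."""
    eig = np.linalg.eigvals(np.linalg.solve(W, T)).real
    return 1.0 - N / ((N - K) * np.prod(eig) + 1.0)

# Toy data (hypothetical): K = 3 groups, n = 10 observations, p = 2 variates.
rng = np.random.default_rng(0)
K, n, p = 3, 10, 2
N = K * n
groups = [rng.normal(loc=k, size=(n, p)) for k in range(K)]
X = np.vstack(groups)
T = (X - X.mean(axis=0)).T @ (X - X.mean(axis=0))                 # total SSCP
W = sum((g - g.mean(axis=0)).T @ (g - g.mean(axis=0)) for g in groups)  # within SSCP

estimate = omega_sq_det(W, T, N, K)
```

All three functions return the same value up to rounding error, and the estimate is always smaller than 1 - Lambda for the same sample.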



3. GENERATING THE POPULATIONS

The proportion of generalized variance of the dependent variates
accounted for by membership in the different populations was defined in
the previous section by

(3.1)    $\omega^2 = 1 - \dfrac{|\Sigma|}{|\Sigma + (\alpha\alpha')/K|}$ ,

where $\alpha = (\alpha_{jk})$ [j = 1, 2, ..., p; k = 1, 2, ..., K]
is the p x K matrix of effect parameters, and $\Sigma$ is the common covariance
matrix of the K p-variate populations.

In order to make $\omega^2$ take on any preassigned value, it is convenient
to diagonalize $\Sigma$ and $(\alpha\alpha')$. Of course, in real life, it is inconceivable
that these two matrices will be simultaneously diagonalized by the same
transformation. Forcing them to do so, therefore, imposes some peculiar
constraint on the configuration of the K population centroids. However,
the constraint is not such as to reduce the dimensionality of the
manifold in which the K centroids lie (which is p or K - 1, whichever is
smaller). Hence, the constraint should not result in an artifactual
loss of generality of the sampling distributions of centroids of
independent random samples from the K populations.

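Let the diagonalized forms of $\Sigma$ and $(\alpha\alpha')/K$ be

(3.2)    $\Sigma^* = V'\Sigma V$ ,

where the columns of V are the orthonormal eigenvectors of $\Sigma$, and

(3.3)    $(\alpha^*\alpha^{*\prime})/K = (V'\alpha)(V'\alpha)'/K$ ,

respectively. Denote the diagonal elements of $\Sigma^*$ by $\sigma^*_{jj}$ (j = 1, 2,
..., p) and the non-zero diagonal elements of $(\alpha^*\alpha^{*\prime})/K$ by $a^*_{jj}$
[j = 1, 2, ..., r = min(p, K-1)]. The determinantal ratio in expression
(3.1) for $\omega^2$ then reduces, successively, as follows:

(3.4)    $\dfrac{|\Sigma|}{|\Sigma + (\alpha\alpha')/K|} = \dfrac{|V\Sigma^*V'|}{|V[\Sigma^* + (\alpha^*\alpha^{*\prime})/K]V'|} = \dfrac{|\Sigma^*|}{|\Sigma^* + (\alpha^*\alpha^{*\prime})/K|} = \prod_{j=1}^{p} \dfrac{\sigma^*_{jj}}{\sigma^*_{jj} + a^*_{jj}} = \prod_{j=1}^{r} \dfrac{1}{1 + a^*_{jj}/\sigma^*_{jj}}$ ,

since $a^*_{jj} = 0$ for j > r.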

Now, in order to assign a specific value, say P, to $\omega^2$, it is
necessary only to let $a^*_{jj}$ take values that satisfy the condition

    $\sum_{j=1}^{r} \log(1 + a^*_{jj}/\sigma^*_{jj}) = -\log(1 - P)$ .

Since it does no violence to let the $a^*_{jj}$ be proportional to $\sigma^*_{jj}$, we
may for simplicity let

(3.5)    $a^*_{jj} = [(1 - P)^{-1/r} - 1]\,\sigma^*_{jj}$ ,   j = 1, 2, ..., r ;
         $a^*_{jj} = 0$ ,   j = r+1, ..., p   (when K-1 < p).

The elements $\sigma^*_{jj}$ of $\Sigma^*$ are, of course, predetermined once we specify
$\Sigma$. These, together with the assigned value P [= 0.1 (0.2) 0.9, say]
of $\omega^2$, determine $a^*_{jj}$ in accordance with equation (3.5), and the
generation of the K populations is complete. Varying $\Sigma$ so that the average
intercorrelations among the variates will be low or moderate, we have a
means for simulating sets of populations of any type encountered in
educational research.
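
The construction can be sketched in a few lines, assuming NumPy and assuming that the proportional choice takes the form a*_jj = [(1 - P)^(-1/r) - 1] sigma*_jj, which is what the logarithmic condition implies when all r ratios a*_jj / sigma*_jj are equal. The function names and eigenvalues below are hypothetical.

```python
import numpy as np

def canonical_effects(sigma_star, P, K):
    """Diagonal elements a*_jj of (alpha* alpha*')/K for a target omega^2 = P.

    sigma_star: eigenvalues of the common covariance matrix (length p).
    Only the first r = min(p, K-1) elements are non-zero.
    """
    sigma_star = np.asarray(sigma_star, dtype=float)
    p = sigma_star.size
    r = min(p, K - 1)
    a = np.zeros(p)
    a[:r] = ((1.0 - P) ** (-1.0 / r) - 1.0) * sigma_star[:r]
    return a

def population_omega_sq(sigma_star, a):
    """omega^2 = 1 - prod_j sigma*_jj / (sigma*_jj + a*_jj)."""
    return 1.0 - np.prod(sigma_star / (sigma_star + a))

sigma_star = np.array([2.0, 1.0, 0.5])      # illustrative eigenvalues
a_full = canonical_effects(sigma_star, P=0.3, K=5)   # r = 3
a_short = canonical_effects(sigma_star, P=0.3, K=3)  # r = 2, last element zero
```

In both cases the resulting population value of omega-squared equals the preassigned P, regardless of the eigenvalues chosen.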



4. SAMPLING PROCEDURES

Now that the populations have been generated so as to have various
preassigned values for $\omega^2$, the next step is to simulate sampling from
these populations. Since sample covariance matrices and sample
centroids are independently distributed when samples are drawn from
multivariate normal populations, these two aspects of the sampling may be
handled quite separately from each other.

For generating simulated sample covariance matrices, the Odell-
Feiveson (1966) procedure is well established. A computer program
written by Montanelli (1971) was utilized for this purpose. The only
modification necessary in the present context is that we need to simulate
the sampling of a covariance matrix from each of K populations with
identical covariance matrix $\Sigma$. However, since we ultimately need only
the pooled within-groups SSCP matrix W, we may, for expedience, simulate
the sample covariance matrix C for a single sample of nominal size
N - K + 1 (where N = n_1 + n_2 + ... + n_K), and then multiply this
covariance matrix by N - K to produce the desired matrix W:

(4.1)    $W = (N - K)\,C$ .
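
A within-groups SSCP matrix with N - K degrees of freedom can be drawn directly from a Wishart distribution. The sketch below uses the Bartlett decomposition as a stand-in; it is not the original program, and the function name and the example covariance matrix are assumptions made for illustration.

```python
import numpy as np

def simulate_within_sscp(sigma, N, K, rng):
    """Draw W ~ Wishart(df = N - K, sigma) via the Bartlett decomposition.

    This is equivalent to (N - K) times the covariance matrix of a single
    normal sample of size N - K + 1. Requires N - K >= p.
    """
    p = sigma.shape[0]
    df = N - K
    L = np.linalg.cholesky(sigma)            # sigma = L L'
    A = np.zeros((p, p))                     # lower-triangular Bartlett factor
    for i in range(p):
        A[i, i] = np.sqrt(rng.chisquare(df - i))   # chi-square on the diagonal
        A[i, :i] = rng.standard_normal(i)          # N(0, 1) below the diagonal
    LA = L @ A
    return LA @ LA.T

sigma = np.array([[1.0, 0.3, 0.2],
                  [0.3, 1.0, 0.25],
                  [0.2, 0.25, 1.0]])        # illustrative correlation matrix
rng = np.random.default_rng(42)
N, K = 75, 5
W = simulate_within_sscp(sigma, N, K, rng)
mean_W = np.mean([simulate_within_sscp(sigma, N, K, rng) for _ in range(2000)], axis=0)
```

Averaged over many draws, W approaches (N - K) times the population covariance matrix, in line with E(W) = (N - K) Sigma.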

Simulating the sampling of centroids so as to get the desired
between-groups SSCP matrix was, as far as could be ascertained, a new
problem encountered in this study that is not discussed in the literature.
It was accomplished as follows.

The diagonal matrix $\alpha^*\alpha^{*\prime}$ whose diagonal elements were generated
in accordance with equation (3.5) is, by definition, the cross product
of the effect-parameter matrix when the variates are expressed in
canonical form. Any p x K factor matrix F of $\alpha^*\alpha^{*\prime}$ that is centered by
rows (since $\sum_{k=1}^{K} \alpha^*_{jk} = 0$ by definition) should, therefore, qualify as
a matrix of effect parameters. Assuming (as will usually be the case)
that p > K - 1, such a matrix F can be generated by taking any p x (K-1)
factor matrix G of $\alpha^*\alpha^{*\prime}$ and postmultiplying it by a (K-1) x K matrix H
that is centered and orthonormal by rows. The simplest matrix to use
as G is

    $G_{(p,\,K-1)} = \begin{pmatrix} (\alpha^*\alpha^{*\prime})^{1/2}_{K-1} \\ 0_{p-K+1,\,K-1} \end{pmatrix}$ ,

that is, the upper-left (K-1) x (K-1) segment of $(\alpha^*\alpha^{*\prime})^{1/2}$ augmented
below by a null matrix of order (p-K+1) x (K-1). One matrix which
satisfies the requirements for H is that whose rows consist of the
coefficients of the set of Helmert contrasts, with the set of coefficients for each
contrast normalized to unity; that is,

    $H = \begin{pmatrix} (K-1)c_1 & -c_1 & -c_1 & \cdots & -c_1 & -c_1 \\ 0 & (K-2)c_2 & -c_2 & \cdots & -c_2 & -c_2 \\ 0 & 0 & (K-3)c_3 & \cdots & -c_3 & -c_3 \\ \vdots & & & \ddots & & \vdots \\ 0 & 0 & 0 & \cdots & c_{K-1} & -c_{K-1} \end{pmatrix}$ ,

where the $c_k$ (k = 1, 2, ..., K-1) are the respective normalizing
multipliers $[(K - k)^2 + (K - k)]^{-1/2}$.

Thus, taking F = GH as our effect-parameter matrix, we have an
eligible $\alpha^*$ matrix of the form

(4.2)    $\alpha^* = \begin{pmatrix} (\alpha^*\alpha^{*\prime})^{1/2}_{K-1}\,H \\ 0_{p-K+1,\,K} \end{pmatrix}$ ,

where $(\alpha^*\alpha^{*\prime})^{1/2}_{K-1}$ stands for the upper-left (K-1) x (K-1) segment of
$(\alpha^*\alpha^{*\prime})^{1/2}$. [When p ≤ K - 1, contrary to the assumption above, G will
be $(\alpha^*\alpha^{*\prime})^{1/2}$ itself without the augmenting null matrix, and will be of
order p x p. We then take as H the matrix consisting of the first p
rows of the H displayed above, and our $\alpha^*$ will be simply $(\alpha^*\alpha^{*\prime})^{1/2}H$.]
By definition, the general element of $\alpha^*$ is

    $\alpha^*_{jk} = \mu^*_{jk} - \mu^*_{j\cdot}$ ,

but we may let each $\mu^*_{j\cdot} = 0$ without loss of generality, since this
merely involves a translation of the axes so that the origin coincides
with the general centroid. Thus, we may take the (j,k)-element of $\alpha^*$
as defined by equation (4.2) as the mean $\mu^*_{jk}$ of the k-th population
on the j-th variate in canonical form. Consequently, we may simulate
sampling the k-th sample mean on the j-th canonical variate
(independently over both k and j) by taking

(4.3)    $\bar{X}^*_{jk} = \mu^*_{jk} + z\,\sqrt{\sigma^*_{jj}/n_k}$ ,

where $\sigma^*_{jj}$ is the variance of the j-th canonical variate (i.e., the
j-th eigenvalue of the common population covariance matrix $\Sigma$);
$n_k$ is the k-th sample size (= N/K in the present context);
and $z \sim N(0, 1)$.

Once $\bar{X}^*_{jk}$ has been determined for each sample (k) for a given
canonical variate, the grand mean for that variate is obtained as

(4.4)    $\bar{X}^*_{j\cdot} = \sum_{k=1}^{K} n_k \bar{X}^*_{jk} / N$ ,

and the j-th row elements of the between-groups SSCP matrix are
obtained as

(4.5)    $(B^*)_{jr} = \sum_{k=1}^{K} n_k (\bar{X}^*_{jk} - \bar{X}^*_{j\cdot})(\bar{X}^*_{rk} - \bar{X}^*_{r\cdot})$   (r = 1, 2, ..., p).

Repeating this for j = 1, 2, ..., p, we get the between-groups SSCP
matrix $B^*$ in canonical form.

The matrix $B^*$ is then "uncanonicalized" by the transformation
inverse to that displayed in equation (3.2); namely,

(4.6)    $B = V B^* V'$ ,

and we get the SSCP matrix B in the original dependent-variate space.

For each set of K simulated samples under each combination of
sampling conditions described in the next section, the matrices W and T
needed for computing $\hat{\omega}^2$ by equation (2.10) were determined by getting
W in accordance with (4.1), B in accordance with (4.6), and finding
T = W + B. The complete computer program is shown in Appendix A.
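
The centroid-sampling steps can be sketched as follows. This is an illustrative reconstruction in NumPy, not the program of Appendix A; the function names are hypothetical, equal group sizes n are assumed, and the diagonal of (alpha* alpha*')^(1/2) is taken as sqrt(K a*_jj) with a*_jj = [(1 - P)^(-1/r) - 1] sigma*_jj.

```python
import numpy as np

def helmert_rows(K):
    """(K-1) x K matrix of normalized Helmert contrasts (rows centered, orthonormal)."""
    H = np.zeros((K - 1, K))
    for k in range(1, K):
        c_k = ((K - k) ** 2 + (K - k)) ** -0.5   # normalizing multiplier
        H[k - 1, k - 1] = (K - k) * c_k
        H[k - 1, k:] = -c_k
    return H

def effect_matrix(eigvals, P, K):
    """Canonical effect parameters alpha* (p x K) giving population omega^2 = P."""
    p = len(eigvals)
    r = min(p, K - 1)
    a = ((1.0 - P) ** (-1.0 / r) - 1.0) * eigvals[:r]
    alpha = np.zeros((p, K))
    alpha[:r] = np.sqrt(K * a)[:, None] * helmert_rows(K)[:r]
    return alpha

def simulate_between_sscp(sigma, P, K, n, rng):
    """Simulate the between-groups SSCP matrix B in the original variate space."""
    eigvals, V = np.linalg.eigh(sigma)                 # sigma = V diag(eigvals) V'
    mu = effect_matrix(eigvals, P, K)                  # population means, canonical form
    noise = rng.standard_normal(mu.shape) * np.sqrt(eigvals / n)[:, None]
    xbar = mu + noise                                  # sample means on canonical variates
    dev = xbar - xbar.mean(axis=1, keepdims=True)      # deviations from grand means
    B_star = n * dev @ dev.T                           # canonical between-groups SSCP
    return V @ B_star @ V.T                            # back to the original space

sigma = np.array([[1.0, 0.3, 0.2],
                  [0.3, 1.0, 0.25],
                  [0.2, 0.25, 1.0]])                   # illustrative correlation matrix
K, n, P = 5, 15, 0.3
rng = np.random.default_rng(1)
B = simulate_between_sscp(sigma, P, K, n, rng)
```

Adding a simulated within-groups matrix W to B gives T, from which the estimate follows by any of the equivalent formulas.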

5. SAMPLING CONDITIONS

The types of populations from which samples were to be drawn have
already been partially indicated in the preceding section. More
specifically, it was planned to have two levels of average
intercorrelation among the variates in the common covariance matrix: low
(0.10 - 0.30) and moderate (0.40 - 0.60). The sets of populations
were also designed to have five levels of $\omega^2$ values: 0.1, 0.3, 0.5,
0.7, and 0.9. Additionally, three levels were used for the number of
variates: p = 3, 5, and 10. The number of populations in each set
was fixed at K = 5.

From each of the 2 x 5 x 3 = 30 sets of five populations each,
samples of three sizes were to be drawn; namely, n = 15, 30, and 60
from each of the five populations in each set, yielding total sample
sizes of N = 75, 150, and 300 for each set. (Since this is not a study
of robustness of a statistical test under violations of assumptions,
it was not deemed necessary to vary the sample sizes across populations
in each set.)

Thus, there were a total of 90 sampling conditions. Under each of
these conditions, 1,000 independent, random samples were to be drawn. It
was anticipated that this number of samples would be more than
sufficient to achieve adequate approximation to the true sampling
distributions of $\hat{\omega}^2$.

6. RESULTS

Although it was originally planned to use sets of populations
having two levels of average intercorrelation among the variates,
preliminary investigations quickly showed that the magnitude of average
correlations had virtually no effect on the sampling distribution of
$\hat{\omega}^2$. The means for three pairs of sets of sampling distributions under
sampling conditions differing only in the average correlations in the
populations were as shown in Table 1. (In retrospect, this lack of
dependence of the sampling distributions on the average correlation
could have been anticipated; the sampling distribution of the squared
multiple correlation coefficient, of which $1 - \Lambda$ is a generalization
for K > 2, does not depend on the particular covariance structure which
produces a given population $\rho^2$ value.) It was therefore decided to
confine further work to population sets with average correlation in
the medium-low range of 0.20 - 0.30. The (common) correlation matrices
for each of the three sets of populations used are shown in Appendix B.



Table 1. Means of sampling distributions of $\hat{\omega}^2$ for three
         pairs of sets of populations, the sets in each
         pair differing only in the average correlation
         ($\bar{\rho}$) in the populations

                              Population $\omega^2$
Condition       $\bar{\rho}$      0.9      0.7      0.5      0.3      0.1

p=3, N=75       .16    .9113    .7296    .5482    .3696    .1970
                .28    .9112    .7297    .5479    .3705    .1970

p=5, N=75       .24    .9192    .7551    .5956    .4382    .2777
                .56    .9187    .7596    .6040    .4462    .2857

p=5, N=1000     .24    .9017    .7047    .5071    .3109    .1143
                .56    .9000    .7041    .5067    .3140    .1141



It also became apparent as computations proceeded that $\hat{\omega}^2$ was
extremely positively biased, especially for population sets with low $\omega^2$
values, when the ratio N/p (of total sample size to number of variates)
was any lower than 40 or so. The original plan of using samples of
sizes 15, 30, and 60 from each population (i.e., total sample size
N = 75, 150, 300) was therefore abandoned in favor of one in which the
N/p ratio had values 50, 100, and 200. This change was deemed
appropriate since it appeared that, in order to get anything resembling
a realistic estimate of $\omega^2$, the N/p ratio had to be of these orders
of magnitude. In retrospect, however, the decision may not have been
a wise one, for reasons described below.

Due to the exorbitant cost of computations for the 10-variate
cases, the number of samples drawn was reduced from 1,000 to either
200, 300, or 500.

With the foregoing modifications of the original plans for sampling
conditions, the means of the generated empirical sampling distributions
of $\hat{\omega}^2$ under the various conditions were as shown in Table 2.



Table 2. Means of sampling distributions of $\hat{\omega}^2$ for various
         numbers of variates (p) and sample sizes (N), for
         sets of populations with $\omega^2$ values as indicated.
         (Each set consists of K=5 populations.)

                        Population $\omega^2$
  p       N      0.9      0.7      0.5      0.3      0.1

  3      75    .9112    .7297    .5479    .3705    .1970
  3     150    .9059    .7129    .5262    .3398    .1461
  3     300    .9030    .7073    .5115    .3181    .1249
  3     600    .9016    .7038    .5058    .3101    .1127

  5      75    .9192    .7551    .5956    .4382    .2777
  5     250    .9056    .7159    .5287    .3425    .1539
  5     500    .9033    .7077    .5156    .3214    .1283
  5    1000    .9017    .7047    .5071    .3109    .1143

 10      75    .9437    .8205    .7104    .6087    .4716
 10     500    .9071    .7222    .5331    .3452    .1650
 10    1000    .9038    .7116    .5167    .3233    .1313
 10    2000    .9016    .7052    .5086    .3124    .1164
Inspection of Table 2 shows that, even for an N/p ratio as large
as 200, the positive bias of $\hat{\omega}^2$ is considerable. At this point it was
suspected that, despite the reasoning in Section 3, the peculiar method
of constructing the populations with predesigned $\omega^2$ values might have
led to anomalous sampling distributions of $\hat{\omega}^2$. Therefore, an
alternative (and more expensive) method, described in Appendix C, that did
not depend on the simultaneous diagonalization of $\Sigma$ and $\alpha\alpha'$ was employed
to generate a set of five populations with p = 5 and $\omega^2$ = 0.1. One
thousand samples of size 15 were drawn from each of the five
populations thus generated, and the distribution of $\hat{\omega}^2$ was constructed and
compared with that under the comparable sampling condition based on the
simultaneous-diagonalization procedure. (See Appendix C.) The mean
of this sampling distribution was 0.2859, or 0.0082 larger than the
mean (0.2777) in the corresponding cell of Table 2. It was therefore
concluded that the positive bias of $\hat{\omega}^2$ was not an artifact of
the populations generated by the simultaneous diagonalization method.
Evidently, our attempt to develop a "quasi-unbiased" estimator of $\omega^2$
was unsuccessful.

Correcting the Bias in $\hat{\omega}^2$

Various attempts to develop alternative statistics that would
estimate $\omega^2$ unbiasedly (or nearly so) were made but proved to be of
no avail. It was therefore decided to try to develop a formula for
correcting the bias in $\hat{\omega}^2$.

Close scrutiny of Table 2 revealed that, within each row (i.e.,
for fixed p and N), the amount of bias, $\hat{\omega}^2 - \omega^2$, seemed to be nearly
a linear function of $1 - \hat{\omega}^2$. This impression was put to a test as
follows. For each row in Table 2, a straight line

(6.1)    $\hat{\omega}^2 - \omega^2 = m(1 - \hat{\omega}^2)$

was fitted by the least-squares method. (That the intercept in equation
(6.1) must be 0 follows from the fact that theoretically $\hat{\omega}^2$ should show
no bias when $\omega^2 = 1$.) From this equation, "corrected" values of $\hat{\omega}^2$
were computed as

(6.2)    $\hat{\omega}^2_{\mathrm{corr}} = \hat{\omega}^2 - m(1 - \hat{\omega}^2)$ ,

and these were correlated with the true $\omega^2$ values within each row. The
results were as shown in Table 3.
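
The zero-intercept fit and the resulting correction can be reproduced for a single row. The sketch below, assuming NumPy, uses the Table 2 means for p = 3, N = 75 and regresses the bias on one minus the observed mean; the function name is illustrative.

```python
import numpy as np

# Population values and the Table 2 means for p = 3, N = 75 (K = 5).
true_omega_sq = np.array([0.9, 0.7, 0.5, 0.3, 0.1])
observed = np.array([0.9112, 0.7297, 0.5479, 0.3705, 0.1970])

def fit_slope_through_origin(observed, true_values):
    """Least-squares slope m of (observed - true) on (1 - observed), no intercept."""
    x = 1.0 - observed
    y = observed - true_values
    return float(x @ y / (x @ x))

m = fit_slope_through_origin(observed, true_omega_sq)
corrected = observed - m * (1.0 - observed)
```

For this row the fitted slope is about 0.1153, and the corrected means come out within a few thousandths of the population values.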



Table 3. Means of sampling distributions of $\hat{\omega}^2_{\mathrm{corr}}$ [corrected by
         equation (6.2)], the proportionality constant m, and the
         correlation (r) between $\omega^2$ and $\hat{\omega}^2_{\mathrm{corr}}$ for various p and N

                        Population $\omega^2$
  p       N      0.9      0.7      0.5      0.3      0.1        m         r

  3      75    .9010    .6985    .4958    .2979    .1044    .1153    1.0000
  3     150    .9007    .6969    .4999    .3031    .0986    .0556    1.0000
  3     300    .9004    .6994    .4983    .2997    .1013    .0270    1.0000
  3     600    .9002    .6997    .4989    .3005    .1003    .0139    1.0000

  5      75    .8995    .6955    .4972    .3015    .1020    .2433    1.0000
  5     250    .8996    .6980    .4990    .3010    .1005    .0631    1.0000
  5     500    .9002    .6984    .5002    .2998    .1006    .0318      …
  5    1000    .9001    .7000    .4993    .3000    .1003    .0158    1.0000

 10      75    .9027    .6896    .4992    .3234    .0863    .7291     .9989
 10     500    .9002    .7016      …      .2966    .1030    .0742    1.0000
 10    1000    .9004    .7013    .4995    .2992    .1004    .0356    1.0000
 10    2000    .8998    .6998    .4998    .2999    .1003    .0182    1.0000
It seems indisputable, from the results exhibited in Table 3,
that a correction term linear in $1 - \hat{\omega}^2$ would suffice for each p and
N (for fixed K) in order to get a nearly unbiased estimator of $\omega^2$.
The problem was to determine the functional dependence of the
proportionality coefficient m on p and N, and on the hitherto unvaried K
(the number of populations).

At this point, the earlier decision to use different sample sizes
for different numbers of variates (in order to control the N/p ratio)
proved detrimental. We were left with insufficient data points
adequately to conjecture the relation of m with p for fixed N, and that
of m with N for fixed p. Nevertheless, there were enough grounds for
surmising that m was approximately inversely proportional to N and
roughly directly proportional to p, as a scanning of Table 3 shows.

The foregoing observations, coupled with knowledge that the
quantities

    $N - 1 - (p + K)/2$   and   $p(K - 1)$

often appear in p-variate, K-sample problems, led to the tentative
conjecture that the m in equations (6.1) and (6.2) might be expressible,
approximately, as

    $m = \dfrac{c\,p(K - 1)}{N - 1 - (p + K)/2}$ ,

where c is a proportionality constant to be determined.

In order to permit greater flexibility in the curve-fitting
venture, however, it was decided to use a more general form as the
conjectured relation between m and its three arguments:

(6.3)    $m(K, p, N) = c\,M^a Q^b$ ,

with

(6.4)    $M = N - 1 - (p + K)/2$

or some similar quantity dependent largely on N, and

(6.5)    $Q = p(K - 1)$

or some other quantity dependent symmetrically on p and K - 1. The
symmetry with respect to p and K - 1 was conjectured from the fact
that a p-variate, K-sample problem can be recast as a canonical-
correlation problem with p variates in one set and K-1 in the other.

Combining equations (6.1) and (6.3), we arrive at the least-squares
problem

(6.6)    $(\hat{\omega}^2 - \omega^2)_i \approx c\,M_i^a Q_i^b\,(1 - \hat{\omega}^2)_i$ ,

where the constants c, a, and b are to be determined on a least-squares
basis, and M and Q are as conjectured in equations (6.4) and (6.5) or
similar expressions in N, p, and K. Taking logarithms of both sides
of equation (6.6), the quantity to be minimized is

(6.7)    $E = \sum \left[\log c + a\log M + b\log Q + \log\dfrac{1 - \hat{\omega}^2}{\hat{\omega}^2 - \omega^2}\right]^2$ ,

where the summation is taken over all data points.

The normal equation for determining the optimal values of c, a,
and b is

(6.8)    $\begin{pmatrix} N_d & \sum\log M & \sum\log Q \\ \sum\log M & \sum(\log M)^2 & \sum(\log M)(\log Q) \\ \sum\log Q & \sum(\log Q)(\log M) & \sum(\log Q)^2 \end{pmatrix} \begin{pmatrix} \log c \\ a \\ b \end{pmatrix} = \begin{pmatrix} -\sum\log W \\ -\sum(\log M)(\log W) \\ -\sum(\log Q)(\log W) \end{pmatrix}$ ,

where $N_d$ is the number of data points, and $W = (1 - \hat{\omega}^2)/(\hat{\omega}^2 - \omega^2)$
(a scalar ratio here, not the within-groups SSCP matrix).
Equation (6.8) enables us to solve for least-squares optimal
values of a, b, and c under various plausible conjectures for M and Q
besides those stated in equations (6.4) and (6.5). We would then
favor choices of M and Q that lead to the smallest value of E as
defined in equation (6.7), subject, of course, to review upon cross-
validating on data points generated by sampling conditions not
included in the optimizing process.
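
The log-linear fit amounts to an ordinary three-parameter least-squares problem. The following sketch, assuming NumPy, solves the normal equations on noise-free synthetic data generated from hypothetical parameter values, so that the fit should recover them; the design points and parameter values are illustrative only.

```python
import numpy as np

def fit_correction_law(M, Q, ratio):
    """Fit ratio = c * M**a * Q**b by least squares on base-10 logarithms.

    ratio is (omega_hat - omega) / (1 - omega_hat), i.e. the reciprocal of
    the quantity W in the normal equations; returns (c, a, b).
    """
    X = np.column_stack([np.ones(len(M)), np.log10(M), np.log10(Q)])
    y = np.log10(ratio)
    log_c, a, b = np.linalg.solve(X.T @ X, X.T @ y)   # normal equations
    return 10.0 ** log_c, a, b

# Hypothetical (p, K, N) design points and synthetic bias ratios.
designs = [(3, 5, 75), (3, 5, 150), (5, 5, 250), (10, 5, 500),
           (4, 3, 75), (4, 7, 500), (4, 10, 75)]
p, K, N = (np.array(v, dtype=float) for v in zip(*designs))
M = N - 1 - (p + K) / 2
Q = p * (K - 1)
ratio = 0.4 * M ** -1.1 * Q ** 1.3        # assumed generating values

c, a, b = fit_correction_law(M, Q, ratio)
```

With real simulation output the ratios are noisy, so the fitted values depend on which sampling conditions are included; swapping in other definitions of M and Q only changes the two lines that compute them.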

To carry out the calculations, it was now necessary to vary K
instead of fixing it at 5 as was done up to this point. It was
decided this time to fix p at 4 and use K = 3, 7, and 10 in combination
with N = 75 and 500. To keep down computer costs, the number of samples
drawn was reduced to 200 for these runs. The means of the resulting
sampling distributions were as shown in Table 4.



Table 4. Means of sampling distributions of $\hat{\omega}^2$ for sets of
         populations with $\omega^2$ values as indicated, for various
         sample sizes (N) and numbers (K) of populations. The
         number of variates is p = 4 throughout.

                        Population $\omega^2$
  N       K      0.9      0.7      0.5      0.3      0.1

  75      3    .9071    .7189    .5304    .3513    .1704
  75      7    .9234    .7720    .6058    .4666    .3054
  75     10    .9344    .7976    .6543    .5293    .3922

 500      3      …        …      .5046    .3062    .1087
 500      7    .9034    .7086    .5135    .3274    .1315
 500     10    .9051    .7152    .5235    .3343    .1499



The data in the 12 rows of Table 2 and those in the six rows of
Table 4 (90 data points in all) were fed into equation (6.8) for
determining the least-squares estimates of the parameters in equation
(6.6), with M and Q as initially conjectured in equations (6.4) and
(6.5). Solution of equation (6.8) with these inputs yielded

    log c = -.4341 (or c = .3680),  a = -1.0677,  b = 1.3631.

Thus, the correction formula for $\hat{\omega}^2$ resulting from the data of Tables 2
and 4 and the conjectures of equations (6.4) and (6.5) is as follows:

(6.9)    $\hat{\omega}^2_{\mathrm{corr}} = \hat{\omega}^2 - .368\,[N - 1 - (p + K)/2]^{-1.0677}\,[p(K - 1)]^{1.3631}\,(1 - \hat{\omega}^2)$ .

The 90 corrected $\hat{\omega}^2$ values resulting from use of this formula were as
shown in Table 5. The root mean squared error for these data points
was 0.0108, and the correlation between $\omega^2$ and $\hat{\omega}^2_{\mathrm{corr}}$ was 0.9993.
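
The fitted correction is simple to apply. The sketch below is a plain-Python illustration; the function name is hypothetical, and the exponent on M is taken as a = -1.0677, an assumed value chosen because it reproduces the corrected means tabulated for several of the p = 3 conditions.

```python
def correct_omega_sq(omega_hat, p, K, N, c=0.368, a=-1.0677, b=1.3631):
    """Bias-corrected estimate: omega_hat - c * M**a * Q**b * (1 - omega_hat).

    M = N - 1 - (p + K)/2 and Q = p(K - 1); the default c, a, b are the
    assumed fitted values described in the lead-in.
    """
    M = N - 1 - (p + K) / 2
    Q = p * (K - 1)
    return omega_hat - c * M ** a * Q ** b * (1.0 - omega_hat)

# Observed mean for p = 3, K = 5, N = 600 when the population value is 0.1.
corrected = correct_omega_sq(0.1127, p=3, K=5, N=600)
```

For this condition the corrected value is roughly 0.102, compared with an uncorrected mean of 0.1127.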

Table 5. Means of sampling distributions of $\hat{\omega}^2$ for 18
         combinations of p (number of variables), K (number of
         populations) and N (total sample size), corrected
         in accordance with equation (6.9).

                             Population $\omega^2$
  p     K      N      0.9      0.7      0.5      0.3      0.1

  3     5     75       …        …     .4952    .2971    .1033
  3     5    150       …        …       …      .3044    .1003
  3     5    300       …        …       …      .3010    .1029
  3     5    600       …        …       …        …      .1022
  5     5     75       …        …       …        …        …
  5     5    250       …        …       …      .3019    .1017
  5     5    500       …        …     .5015    .3017    .1030
  5     5   1000       …        …       …      .3014    .1021
 10     5     75       …      .7064   .5262    .3599    .1356
 10     5    500       …        …     .4980    .2960    .1023
 10     5   1000       …        …       …      .2993      …
 10     5   2000       …        …     .5005    .3008    .1015
  4     3     75       …        …       …        …        …
  4     3    500       …        …       …        …      .1013
  4     7     75     .8999    .7020   .4847    .3028    .0921
  4     7    500       …        …       …      .3023    .0991
  4    10     75       …        …       …      .2721    .0601
  4    10    500     .8989    .6967   .4925    .2910    .0946
Various other conjectures for M and Q, such as M = N-(p+K), M = N,
Q = p+K-1, Q = p^2+(K-1)^2 were tried out, each combination giving rise
to a table essentially similar to Table 5. These are not included here
because there is little point in presenting a lengthy series of similar
tables. The three most promising combinations besides that consisting
of the expressions given in equations (6.4) and (6.5) led to the
following estimated values for the parameters c, a, and b:

    M = N-1-(p+K)/2, Q = p^2+(K-1)^2:  c = .2801, a = -1.0692, b = 1.1343
    M = N,           Q = p(K-1):       c = .4358, a = -1.1048, b = 1.3899
    M = N,           Q = p^2+(K-1)^2:  c = .3041, a = -1.1066, b = 1.1579

In order to test the effectiveness of the corrections using these
combinations of parameter values on a new set of data, the means of
sampling distributions of $\hat{\omega}^2$ for four new combinations of p, K, and N
were computed. The resulting 20 data points were as shown in block
(i) of Table 6.

Blocks (ii) - (v) of Table 6 show the corrected values using
the four combinations of M and Q with the values for the parameters
c, a, and b as cited above. Also shown, at the bottom of each of
these blocks, is the root-mean-square error (RMSE) for that set of
corrected $\hat{\omega}^2$ values. Examination of these RMSE values and a scanning
of the table indicate that the corrections seem quite adequate. All
the $\hat{\omega}^2_{\mathrm{corr}}$ values are correct at least to one digit, and about two-
thirds are correct to two or more digits.

In terms of the RMSE values, the two combinations using Q = p(K-1)
yield somewhat better corrections than do those using Q = p^2+(K-1)^2.
Table 6. Means of sampling distributions of $\hat{\omega}^2$ for four combinations
         of p (number of variables), K (number of populations), and
         N (total sample size): (i) as observed; and (ii) - (v)
         corrected for bias by subtracting out the following
         correction terms:

         (ii)   $.3680\,[N-1-(p+K)/2]^{-1.0677}\,[p(K-1)]^{1.3631}\,(1 - \hat{\omega}^2)$
         (iii)  $.2801\,[N-1-(p+K)/2]^{-1.0692}\,[p^2+(K-1)^2]^{1.1343}\,(1 - \hat{\omega}^2)$
         (iv)   $.4358\,N^{-1.1048}\,[p(K-1)]^{1.3899}\,(1 - \hat{\omega}^2)$
         (v)    $.3041\,N^{-1.1066}\,[p^2+(K-1)^2]^{1.1579}\,(1 - \hat{\omega}^2)$

         Also shown are the root mean squared error, RMSE, for each
         set of corrected $\hat{\omega}^2$ values.

                                  Population $\omega^2$
Block    p    K     N      0.9      0.7      0.5      0.3      0.1

(i)      4    6    150    .9118    .7288    .5489    .3628    .1909
         4    5    300      …      .7128    .5164    .3245    .1355
         7    4    300    .9030    .7194    .5291    .3403    .1522
         7    8    400    .9092    .7309    .5502    .3673    .1900

(ii)     4    6    150    .9022    .6994    .5003    .2938    .1032
         4    5    300    .9013    .7021    .4984    .2993    .1033
         7    4    300      …      .7042    .5036    .3046    .1063
         7    8    400    .8977    .6969    .4934    .2874    .0877
                                                   (RMSE = .00518)

(iii)    4    6    150    .9036    .7035    .5069    .3035    .1156
         4    5    300    .9017    .7034    .5006    .3024    .1072
         7    4    300    .8989    .7013    .4987    .2978    .0976
         7    8    400    .9014    .7077    .5116    .3130    .1204
                                                   (RMSE = .00774)

(iv)     4    6    150    .9021    .6988    .4990    .2924    .1015
         4    5    300    .9012    .7020    .4982    .2990    .1029
         7    4    300    .8998    .7040    .5032    .3040    .1056
         7    8    400    .8974    .6959    .4917    .2851    .0847
                                                   (RMSE = .00594)

(v)      4    6    150    .9034    .7031    .5062    .3024    .1143
         4    5    300    .9017    .7033    .5004    .3022    .1070
         7    4    300    .8988    .7010    .4982    .2969    .0965
         7    8    400    .9012    .7073    .5107    .3118    .1189
                                                   (RMSE = .00718)
In particular, the M and Q originally conjectured and stated in
equations (6.4) and (6.5) give the smallest RMSE. Pitted against this
advantage, however, is the fact that the values of b for the two
combinations involving $Q = p^2 + (K-1)^2$ are closer to unity than they are
for the other two. (Note that all four values of a are quite close
to -1.) Observing further that the value of c for the last combination
is close to 1/3, we are led to an alternative, simpler formula
that should be sufficient for providing a rough, "rule-of-thumb"
correction; namely

$$\hat{\omega}^2_{corr} = \hat{\omega}^2 - \frac{p^2 + (K-1)^2}{3N}\,(1 - \hat{\omega}^2). \tag{6.10}$$

Using this rule-of-thumb correction for the 22 (p, K, N)-combinations
(110 data points in all) that were considered in the foregoing,
the corrected $\hat{\omega}^2$ values showed the following distribution of number of
significant digits in agreement with the true values:

20 agreed to 3 digits,
55 agreed to 2 digits,
33 agreed to 1 digit,
and 2 agreed to 0 digit.

(The two showing 0-digit accuracy were larger by 0.1 than the true
values, when rounded to one decimal place.)

It therefore seems safe to conclude that, at least within the
limits of the p (number of variates), K (number of populations) and
N (total sample size) values that were examined, the single correction
formula presented in equation (6.10) will suffice to reduce the bias
in $\hat{\omega}^2$ to less than 0.05. The constraints are that $p(K-1) \le 49$ and
$75 \le N \le 2000$. It remains to be seen how the correction formula will
work outside these limits.
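
The rule-of-thumb correction amounts to taking c = 1/3, a = -1, b = 1, M = N and $Q = p^2 + (K-1)^2$ in the general correction term. A minimal sketch follows; the function and argument names are illustrative assumptions, and the range check simply restates the limits given above.

```python
def rule_of_thumb_correction(omega_sq_hat, p, k, n):
    """Shrink an observed omega-hat squared by the rule-of-thumb term.

    p: number of variables, k: number of populations, n: total sample size.
    """
    shrink = (p ** 2 + (k - 1) ** 2) / (3.0 * n)
    return omega_sq_hat - shrink * (1.0 - omega_sq_hat)


def within_examined_range(p, k, n):
    """True when (p, k, n) lies inside the limits examined in the study."""
    return p * (k - 1) <= 49 and 75 <= n <= 2000


if __name__ == "__main__":
    # Observed mean .1909 from Table 6 (p = 4, K = 6, N = 150, true value 0.1).
    corrected = rule_of_thumb_correction(0.1909, p=4, k=6, n=150)
    print(f"corrected value: {corrected:.4f}")
    print("inside examined range:", within_examined_range(4, 6, 150))
```

For the observed mean of .1909 the corrected value is about .117, against a true value of .1, so the bias falls from roughly .09 to under .02.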
Confidence Intervals for $\omega^2$

Another approach to estimating the population $\omega^2$ from even a
badly biased statistic is to construct charts for getting confidence
intervals. For this purpose it is unnecessary first to eliminate (or
even reduce) the bias in $\hat{\omega}^2$, provided we are willing to adopt
confidence intervals that will often not even include the observed
statistic value, especially when p and/or K is large.

Having arbitrarily decided to construct symmetric 90% confidence
intervals (symmetric in the sense that 5% is excluded at each tail end, but
not symmetric about the observed $\hat{\omega}^2$), we computed the 5th and 95th
centile points of the sampling distributions for selected combinations of
p and N, with K fixed at 5. The selection was based essentially
on those combinations represented in Table 2, excluding the largest
N for each p. (Inspection of the sampling distributions for the
largest-N cases showed that the confidence intervals would reduce
practically to point estimates within the accuracy of graphing for
these cases.) Sampling distributions represented in Tables 4 and 6
were not considered because they were based on only 200 samples, which
would make the $C_5$ and $C_{95}$ values unreliable. Additionally, the sampling
distributions for N = 150 with p = 5 were computed anew, since the
jump from N = 75 to N = 250 seemed large in this case.

The 5th and 95th centiles, $C_5$ and $C_{95}$, of the selected sampling
distributions were as shown in Table 7. The values for the null
distributions (with $\omega^2 = 0$) were computed backwards from Rao's (1952)
approximate equation relating $\Lambda$ to an F distribution under the null
hypothesis. This relation states that
Table 7. 5th and 95th centiles of sampling distributions of $\hat{\omega}^2$ for
3, 5, and 10 variables, each with three selected sample
sizes and six true $\omega^2$ values. (A "?" marks an illegible digit.)

p = 3
                 N = 75            N = 150           N = 300
True $\omega^2$   C5      C95       C5      C95       C5      C95
   0         .0198   .2082     .0092   .1068     .0043   .0550
  .1         .0725   .3333     .0675   .2425     .0713   .1828
  .3         .2275   .5256     .2380   .4505     .2418   .3945
  .5         .4083   .6796     .4288   .6225     .4410   .5828
  .7         .6208   .8268     .6411   .7854     .6560   .7568
  .9         .8728   .9466     .8767   .9348     .8804   .9225

p = 5
                 N = 75            N = 150           N = 250
True $\omega^2$   C5      C95       C5      C95       C5      C95
   0         .0981   .3189     .0478   .1681     .0282   .1??2
  .1         .1513   .4125     .1131   .2925     .0934   .2225
  .3         .2915   .5808     .2645   .4850     .2621   .4280
  .5         .4650   .7164     .4575   .6425     .4550   .6041
  .7         .6569   .84??     .6615   .8025     .6586   .7708
  .9         .8798   .9547     .8808   .94??     .8838   .9299

p = 10
                 N = 75            N = (illegible)   N = 1000
True $\omega^2$   C5      C95       C5      C95       C5      C95
   0                           .??51   .0989     .0225   .05??
  .1                           .1245   .2163     .1011   .1669
  .3                           .2925   .3975     .2875   .3650
  .5                           .4850   .5900

(The N = 75 column and the remaining entries for p = 10 are not
recoverable from this copy.)

$$\frac{1-\Lambda^{1/s}}{\Lambda^{1/s}}\cdot\frac{\nu_2}{\nu_1} = F_{\nu_1,\nu_2}, \tag{6.11}$$

where

$$\nu_1 = p(K-1), \qquad \nu_2 = ms - p(K-1)/2 + 1,$$

and

$$m = N - 1 - (p+K)/2, \qquad s = \sqrt{\frac{p^2(K-1)^2 - 4}{p^2 + (K-1)^2 - 5}}.$$

From relation (6.11), it follows that

$$\Lambda^{-1/s} = 1 + hF,$$

where

$$h = \frac{p(K-1)}{ms - p(K-1)/2 + 1}.$$

Consequently, the $100\alpha\%$ point of the distribution of $\Lambda$ when $\omega^2 = 0$
may be calculated as

$$\Lambda_\alpha = \operatorname{antilog}\left[-s \log\left(1 + h\,F_{1-\alpha;\,\nu_1,\nu_2}\right)\right],$$

where $F_{1-\alpha;\,\nu_1,\nu_2}$ is the $100(1-\alpha)\%$ point of the F distribution.
Upon substituting this value in equation (2.9), namely

$$\hat{\omega}^2 = 1 - \frac{N\Lambda}{(N-K) + \Lambda},$$

we get the $100(1-\alpha)\%$ point of the null distribution of $\hat{\omega}^2$. Using
$\alpha = .95$ and $.05$, therefore, gives us the $C_5$ and $C_{95}$ values, respectively,
of the sampling distribution of $\hat{\omega}^2$ when $\omega^2 = 0$.

Figure 1 shows the graphs of the upper and lower limits of the
symmetric 90% confidence interval for N = 75, 150, and 300 in the
three-variate case. Similar graphs may be constructed for p = 5 and p = 10
from the data given in Table 7.
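
The backward computation of the null centiles can be sketched as follows. This is an illustration under stated assumptions, not the program used in the study: the F quantile is taken as an input (from tables or a statistics library), the function names are hypothetical, and s is set to 1 when $p^2 + (K-1)^2 = 5$, which is the usual convention for Rao's approximation rather than something stated in the text.

```python
import math


def rao_constants(p, k, n):
    """Return (nu1, nu2, s, h) of Rao's F approximation, relation (6.11)."""
    q = k - 1
    m = n - 1 - (p + k) / 2.0
    denom = p ** 2 + q ** 2 - 5.0
    # Assumed convention: s = 1 when the denominator vanishes.
    s = 1.0 if denom == 0 else math.sqrt((p ** 2 * q ** 2 - 4.0) / denom)
    nu1 = p * q
    nu2 = m * s - p * q / 2.0 + 1.0
    return nu1, nu2, s, nu1 / nu2


def omega_sq_hat_from_lambda(lam, k, n):
    """Equation (2.9): omega-hat squared as a function of Wilks' Lambda."""
    return 1.0 - n * lam / ((n - k) + lam)


def null_centile(f_quantile, p, k, n):
    """Map a quantile of F(nu1, nu2) to the same quantile of null omega-hat^2.

    A larger F gives a smaller Lambda and hence a larger omega-hat squared,
    so the 95th centile of F yields C95 and the 5th centile yields C5.
    """
    _, _, s, h = rao_constants(p, k, n)
    lam = (1.0 + h * f_quantile) ** (-s)
    return omega_sq_hat_from_lambda(lam, k, n)


if __name__ == "__main__":
    nu1, nu2, s, h = rao_constants(p=3, k=5, n=75)
    print(f"nu1 = {nu1}, nu2 = {nu2:.2f}, s = {s:.4f}, h = {h:.5f}")
```

To use it, look up the 5th and 95th centiles of F with the printed degrees of freedom and pass each to `null_centile`.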
7. SUMMARY AND CONCLUDING REMARKS

The sampling distribution of a measure of strength of relationship
in multivariate analysis of variance (MANOVA), proposed by Tatsuoka
(1970), was examined by a computer simulation technique developed
especially for this purpose. The measure, denoted $\hat{\omega}^2$ and defined in
equation (2.10) as

$$\hat{\omega}^2 = 1 - \frac{N}{(N-K)\,\lambda_1\lambda_2\cdots\lambda_p + 1},$$

where the $\lambda_i$ are the eigenvalues of $W^{-1}T$ (W being the within-groups, and
T the total SSCP matrix), was found to be highly positively biased,
especially when the population value is small.

Although the amount of bias steadily decreases with increasing N
(indicating that $\hat{\omega}^2$ is a consistent estimator of $\omega^2$), it does not become
negligible until the N/p ratio exceeds 50 for $\omega^2 \ge .50$, and exceeds 100
for $\omega^2 \le .30$. This means that a study involving p = 10 variables must
use at least 1,000 subjects before any realistic estimate of the
population $\omega^2$ can be obtained.

Since such large samples are not ordinarily used in typical studies
(with the exception of large-scale statewide or nationwide studies), it
becomes important to have a means for eliminating, or at least reducing,
the bias in $\hat{\omega}^2$.

Various attempts to develop, theoretically, an alternative statistic
with little or no bias proved futile. An empirical approach was
therefore adopted. Careful study of graphical plots of the amount of bias
revealed that a linear correction of the form

$$\hat{\omega}^2_{corr} = \hat{\omega}^2 - \alpha\,(1 - \hat{\omega}^2)$$

would suffice for any fixed number of variables (p), number of
populations or levels in the MANOVA factor (K), and total sample size (N).
Furthermore, the proportionality constant $\alpha$ appeared to be approximately
inversely proportional to N and (very roughly) directly proportional to
p for fixed K.

The above observations, combined with knowledge that p and K-1 should
affect the amount of bias in the same way, led to the conjecture that a
correction of the form

$$\hat{\omega}^2_{corr} = \hat{\omega}^2 - c\,M^{a}\,Q^{b}\,(1 - \hat{\omega}^2)$$

should be adequate. Here M is an expression largely dependent on N,
Q an expression symmetrical in p and K-1, and c (> 0), a (< 0), and
b (> 0) are parameters to be estimated.

Several different expressions for M and Q were tried out, and the
parameters c, a, and b were estimated by the least-squares method, based
on 18 different combinations of p, K, and N, with five $\omega^2$ values (0.1,
0.3, 0.5, 0.7, and 0.9), for a total of 90 data points.
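
The exact least-squares criterion used for c, a, and b is not spelled out here. One common way to fit a term of the form $c\,M^{a}Q^{b}$ is ordinary least squares on the logarithmic scale, sketched below as an assumption: the ratio bias / $(1 - \hat{\omega}^2)$ is regressed on log M and log Q. All names are hypothetical, and the demonstration data are synthetic values generated from known parameters, not figures from the study.

```python
import math


def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for j in range(col, n + 1):
                aug[r][j] -= factor * aug[col][j]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(aug[r][j] * solution[j] for j in range(r + 1, n))
        solution[r] = (aug[r][n] - tail) / aug[r][r]
    return solution


def fit_power_law(points):
    """Fit ratio ~ c * M**a * Q**b by least squares on the log scale.

    points: list of (M, Q, ratio), ratio = bias / (1 - omega_hat_sq) > 0.
    Returns (c, a, b).
    """
    rows = [(1.0, math.log(m), math.log(q)) for m, q, _ in points]
    ys = [math.log(ratio) for _, _, ratio in points]
    # Normal equations X'X beta = X'y with beta = (log c, a, b).
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(3)]
    log_c, a, b = solve_linear_system(xtx, xty)
    return math.exp(log_c), a, b


if __name__ == "__main__":
    # Synthetic check: generate ratios from c = 1/3, a = -1, b = 1.
    combos = [(3, 5, 75), (5, 5, 150), (10, 5, 1000), (7, 8, 400), (7, 4, 300)]
    data = [(n, p ** 2 + (k - 1) ** 2, (p ** 2 + (k - 1) ** 2) / (3.0 * n))
            for p, k, n in combos]
    print(fit_power_law(data))
```

Fitting on the log scale weights relative rather than absolute errors, so its estimates need not coincide with those of a direct nonlinear fit.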

Cross-validation on 20 data points not used in the process of
selecting optimal expressions for M and Q and estimating c, a, and b
led to the choice of the following correction formula:

$$\hat{\omega}^2_{corr} = \hat{\omega}^2 - c\,N^{a}\,[p^2 + (K-1)^2]^{b}\,(1 - \hat{\omega}^2),$$

with c close to 1/3, a close to -1, and b close to 1 (the fitted
numerical values are not legible in this copy).

This was simplified to a rule-of-thumb correction formula

$$\hat{\omega}^2_{corr} = \hat{\omega}^2 - \frac{p^2 + (K-1)^2}{3N}\,(1 - \hat{\omega}^2).$$

The simplified formula, when tested against all 110 data points
(90 derivation points and 20 cross-validation points), yielded very
satisfactory results: agreement with $\omega^2$ to at least one significant
digit in all but two cases, and agreement to two or three significant
digits in slightly more than two-thirds of the cases. This formula
was therefore deemed to be adequate at least when $p(K-1) \le 49$ and
$75 \le N \le 2000$.

At this point, the question naturally arises: if a further
correction is needed on $\hat{\omega}^2$, which was designed to be a correction of sorts for
$1 - \Lambda$ (where $\Lambda$ is Wilks' likelihood-ratio criterion), why should one
bother with a statistic more complicated than $1 - \Lambda$? The answer is,
obviously, that one need not bother. This conclusion was implicit in
Huberty's (1972, p. 525) statement that "It is clear that numerically
the difference between [three index symbols, illegible in this copy] is
practically nil," made in a study comparing four multivariate indices of
strength of association, including $1 - \Lambda$ (an $\eta^2$ index in his notation)
and $\hat{\omega}^2$. However, the further conclusion that any one of these "may be
employed as indices of discriminatory power of a set of variables" has
now been shown to miss the mark. To put it bluntly, the present study, in
combination with Huberty's findings that they all yield nearly equal
numerical values, shows that they are all equally poor instead of equally
good.

The natural thing to do would seem to be to develop a correction
formula to be applied to either of Huberty's purely sample-descriptive
indices, namely the one equal to $1 - \Lambda$ or the one based on the trace of
a matrix product involving $W^{-1}$. In view, again, of Huberty's findings, it
is likely that the rule-of-thumb correction formula given in equation
(6.10) and cited above will suffice for all practical purposes. In
particular, the correction for $1 - \Lambda$ would be

$$(1 - \Lambda)_{corr} = (1 - \Lambda) - \frac{p^2 + (K-1)^2}{3N}\,\Lambda.$$



It may seem odd that virtually nothing is known about the noncentral
distribution of so widely used and long established a statistic as
Wilks' $\Lambda$, but this appears to be the case. Gupta (1971) derived the
distribution in the special case when the noncentrality matrix is of unit
rank, and asserted that, for the general case, the distribution "has not
been expressed in a numerically feasible form" (p. 1259). As far as could
be ascertained, nothing has appeared in the literature since 1971 to
negate this assertion.

Besides the correction-formula method, another means was presented
for estimating $\omega^2$ from $\hat{\omega}^2$; namely, the confidence intervals.

Charts were given in Figure 1 for p = 3 with N = 75, 150, and 300, and
data for constructing similar charts for p = 5 and 10 were presented
in Table 7.
APPENDIX A: Computer Program and Sample Output

C0VM3N <FRO{200f 5)»PV1EAN{5) tUVlEANtS} •EMhAN{ 5 ) ,n;-iTxS 
!• ICUUNTtOMETR( 30»3w) tAVOAC 30) tTEMPEC VOO I tt.-lCI i»iOt 
1 1 lNDXff<!y»SDP( 30i 9lS<IP 
U0C2CAL IPAR(8) 

INTEGE««<^ IOuy{3C»2 I tlOtT 
INTEGfcR FMTC20I tB|.AN<l20l/20»« '/tlDliO) 
DIMENSION SIOMA(30»30) .TOOtSO) tSi^SliOl •! JC5) 
EQU I VAi. ENCE { NROa • I P } 
C***** INPUT PARAMETERS
C     ID     CARD OF IDENTIFICATION. FIRST 4 COLUMNS APPEAR IN
C            PUNCHED OUTPUT
C     FMT    INPUT FORMAT FOR POPULATION CORRELATIONS.
C     IPAR   LOGICAL FLAGS FOR CONTROLLING PRINTED OUTPUT
C     ISKIP  FLAG FOR CONTROLLING PRINTED OUTPUT IN SUBROUTINE
C     KGR    NUMBER OF GROUPS
C     IX     INTEGER STARTING POINT FOR RANDOM NUMBER GENERATOR
C     NMTXS  NUMBER OF SAMPLE CORRELATION MATRICES TO BE GENERATED
C     N      SAMPLE SIZE
C     NROW   NUMBER OF ROWS IN INPUT MATRIX

R*:ADC5tl001) IDtFvT,IPARtIX»NnxStNt.vROwtiCoKt lOt ISI^IH* 
REAOfStFVT) <SDP{ n#I«liNROw) 
IS<IP«IS<IP*1 
NINT-20C 
WRITEt 6»100«» } lOilA 
DO 777 I»lt5 
OMEAN( I )«0. 
OMEAN( I )«0. 
EVEAN( n»0. 
00 778 J«ltNlNT 
KFR3{ Jf I )bO 
778 CONTINUE 
7 77 CONTINUE 

w/RITE(6tX032) NRO*t<GRtNtNWTxS 
CALL RN3IN2J IX) 
C READ 1% SIGVAfCALCULATL O-VEGA TRANS^^OS t » UPPt K Tr<I ANGULAX 
NaN-KGR+l 

DC ^♦739 I»l fNRUw 
^739 RtADCStFMT) {SIGMAJ I,U} , U«lt\.^jA) 

S«0. 

DO 250 I>lffNRCW 
DO 250 U-Xtl 
250 S»S^SIGMA( I »J) 

X» (NROW-l ) •NROw/2. 
S«S-NSOrt 
S«S/X 

«tfRITE{6f2001 } S 
I»n-2*J 

WRITE {6t 1033} {J tj«lt5) t {J ♦ jal tb } t CJ tt>) 
DO 2000 I«ltNROw 
DO 1500 «»I»\R0W 
1500 SICVA{ I,u}sSIoVA{U#n 

20CC SIG^A( itn>i» 

IFUPARf I) ) GO TO 911 
30 901 L8l».\R0» 

901 wRiTti 6»1011 1 L»{Sl&v,A(u»f.) 

911 CONTINUE 

C     CHOLESKY FACTOR OF SIGMA, STORED IN OMETR AS UPPER TRIANGULAR
      DO 10 I=1,NROW
   10 OMETR(1,I)=SIGMA(1,I)
      DO 200 I=2,NROW
      ILESS1=I-1
      IPLUS1=I+1
      SUM=0.0
      DO 110 J=1,ILESS1
  110 SUM=SUM+OMETR(J,I)*OMETR(J,I)
      OMETR(I,I)=DSQRT(SIGMA(I,I)-SUM)
      IF(I.EQ.NROW) GO TO 200
      TEMP=1.0/OMETR(I,I)
      DO 130 J=IPLUS1,NROW
      SUM=0.0
      DO 120 K=1,ILESS1
  120 SUM=SUM+OMETR(K,I)*OMETR(K,J)
  130 OMETR(I,J)=TEMP*(SIGMA(I,J)-SUM)
  200 CONTINUE

IFI IPAR{2) ) GO TO 912 

WRITE{691006) 

DO 902 U«1»NR0W 

902 WRIT£{6tl011 > U • C O.'ETRl v, l J t .Va 1 ) 

912 CONTINUE 

DU 2002 I«l9/\iR0w 
DO 2C02 jaliNROw 
2 002 T( I tJ)=SIGVA{ I »J)*SOP( I )*Si;^( J) 

KlVal 

CALL SB£TW( Tf NffNkOWff I X»<GR»I ;j9lP} 
<IM»2 

C 

C LOOP FOR ALL SAMPLE CCItkELATI ON MATRICES 
C 

205 NCCDE-N/100 

DO 900 ICOUNTal tNVTxS 
C CALCULATE TtLOwER TRIANGULAR 
DO 300 I«l9NR0M 
NOFa\-I 
DF«NOF 

CALL NORMAl(X{ 

X2«X*X 

X3«X»X2 

H60».30804<»lE'-0 3»X3-.15S9t>5 3E-03»X2-.92<»4112E-03* 
l*.1885979F-03 

H «( 60.0/DF) »H60 

TEVP«{2«0/(9,0»DF) ) 

T{ !»I >»DF»{ l.O-TEf^Pi'lx-Hl^DSQ^TlTE.XP} )»»3 

T( I »n *DS3'<T{ T U tl ) ) 
IPC I.EQ.l } GO TO 300 
ILESS1«I-1 
00 210 J«ltlLESSl 
CALL NORMAL(X) 
  210 T(I,J)=X
300 CONTINUE 

IFUPAROI I GO TO 913 

WRIT£I6»1007} 

00 903 U*ltNROM 

903 WRITE(6tI011) U« ( T(U»M) •.yeXtL) 
913 CONTINUE 

C     CALCULATE OMEGA*T, AND STORE IN SIGMA, AS A LOWER TRIANGULAR
      DO 350 I=1,NROW
      DO 350 J=1,I
      SIGMA(I,J)=0.0
      DO 350 K=J,I
  350 SIGMA(I,J)=SIGMA(I,J)+OMETR(K,I)*T(K,J)
IFI IPAf^C^^J ) GO TO 9X<> 
wRIT£(6tIC08J 
DC 90<i LsltNROW 

904 WRITE(6»10X1) 1 1 ( SXGMAI L tNi » tMe i tL ) 
91<» CONTINUE 

C     FORM A MATRIX (IN T, LOWER TRIANGLE ONLY)
      DO 370 I=1,NROW
      DO 370 J=1,I
      T(I,J)=0.0
      DO 370 K=1,J
  370 T(I,J)=T(I,J)+SIGMA(I,K)*SIGMA(J,K)

IF{ IPARCSn 60 TO 9X5 

VvRITE{6»X009) 

DO 905 w«XtNROi« 
9C5 WRIT£(6»10XX) U • { T( t t M } ,M« X J 

915 CONTINUE 

DO 600 IsX«NRCm 
00 600 U«X»NROm 
6C0 T{I#J}8Tn»J)*SD^'m#SDP( J) 

CALL SSETwC TtNtNROM'»IX»KGRtIO»IP} 
IF( IPAR(8I ) GO TO 900 
C GET STANDARD DEVIATIONS! STORE IN F.^IT.A.NO CALCULATE f<,I,\ T 
DO 380 I«ltNROW 

SDSi I }»DSORTC TCI il ) } 
DO 380 JsXtl 

T C I • J ) aT C 1 1 J ) / C SOS C n »SDSU ) } 
38C TCUtn«TCI»JJ 

IF C IPARC 6) ) GO TO 916 
WRITt<6t XOlO) N 
C DO 906 L»ltNROW 

C P-RINT FIRST R0-* OF tACH CORRELATION MATRIX 
L»sX 

906 WRITh(6fl01X) Ltl T I L •.•U t.^* 1 i.\HOw ) 

916 CONTINUE 

IFC IPARC 7) J GO TO V17 
00 907 L«Xt\ROW 

DC 907 <»X»4 

^2».Xl + 9 

WRITEC 7,1020) IDC 1> •LfIC0UNT»NCODE.<t a (LiMlfVsv.i.M^) 

907 vxsva^x 
WR2TE(7tl021) HLAN< 
917 CONTI.MUE 
900 CONTINUE 
950 CALL RN3\D2( IX ) 

WRXTEl6tl999! IX 

CALL EXIT 

1001 F0aMAT{20A^/20A4/8Ll»2X»l l0ti3J5) 

100«» FO«v,AT( 'It.aOA^^t'.lNTEGEK STARTING VAlU£=SI10> 

1005 FO«vAT( 'OINPUT CORKhLATION VATf<IX»/) 

1006 FOSVATMlSOUAi<£ KSwT FACTJKS OF CUkKcLAT I J.MiQ • / ) 
IOC 7 FORVATMIT MAT-ilx'/) 

lOOe FO«<«AT( • 10VEGA*T Vl 

1009 FORVATC^' A MAT^IXVl 

ClOlO FORVATC 'ISA.'^OLE COKkELATION MATRI X • • 16/ ) 

1010 FORVATC »-SA:v|3L£ CQKRELATION MATi^I X • # 16/ ) 

1011 FOR-^^ATI I3tlOF13.6/l3X,10F13.6) J 

1020 F0RyAT{A^t2I2t2Ilti0F7.5) 

1021 FORMAT {20A^ J 

1032 FORVATt' \0. OF DcPENOEUT VAt^IAttLES » ' t U/ 
1' \0. OF GROUPS s •»!<»/ 

2' SAMPLt SIZE m 

3» ^40. OF SAVPLES c • 1 1<» ) 

1033 FORMAT ( /19X» • l-LA.'-IBDA VALUES' t2lX 

It • OVEGASQUARE VALUES • ,2 3X • • t i<Ki;r<S • / /6X 1 1 6 

2tl4l6) 

2001 FORVAT{' AVFRAGE CORRELATION « 'tFS.i*) 

1999 FOR.'IATC 'OINTEGER STOPPING POI.\T» • 1 1 10 ) 
END 

SUBROUTINE S8ET.»v{ TtNtNROrttlX.i^.IU.IP) 
IMPLICIT RFAL»SC A-HtO-Z) 

COV.MON lCFRQ{20Ct5> tP^1EA^^ 5) tO-YEAN C 5 ) tEMfcA.Si 5 ) , NMTXS 
1 1 ICOUNTtOVETRt 30 1 30 ) t AMOA { 30 ) t TEMPER ^00 ) t EMC { 5 1 1 0 1 10 ) 
If lNDXf<I.y,SDP{ 30) tIS<IP 

INTEGER#<» IDUMC 30t2) tlNOIC(lO) tlOtT 
D I MENS I ON S I GMA «30f30}tT{30t30) t OVEGA I 30 1 30 J • DUMl { <»6 1> ) 
1 1 ID(20) tFVT{20) tBETt 3Ct30) t IQ( 5} tPRT« 6) tEROkC 5J ,SRTI5 ) 
2fVEC{90C) tAVALI 3C) tTvALUO! 
NINT=200 

C CONVE^iT CORRFlATION MATRIX TO COVARIANCE V.ATK I XtSTORE 
C IN SIGMA 

00 10 LaltNROw 

00 10 j8l,\R0M( 
IC T{Lt JJ -TC J,L J 

00 11 L«ltNR0W 

00 11 J»ltNROW 
11 SIGVA{LtJ)«TCLtJ) 

NsN+K-l 

SMESO'O* 

c 

C CALCULATE EIGENVALUES AfO VECTORS OF POPULATION 

C COVARIANCE MATRIX, CNLiT Q-j FIRST PASS. UTHEK**<ISfc S< I P 

C TO 771 

C 

GO TO ( 770t77l } 






c 

C#*#* INPUT PARAMETERS 

C < .NUMBER OF GROUPS 

C IP NUMBER OG VARIABLES 

C NROW ORQEk OF COVARIANCE MATRIX 

C N SAMPLE SUE 

C lOalC*TH£ GIVEN VALUE OF OMEGA SO* 

C SIGMA CONTAINS VARI ANCE-COVARI ANCE MATRIX 

C 

770 CALL EIGENZC T»TtMPE»AMDA»DUMl fN^OWtiOtO) 

1PAT«30-NROW 
00 1785 IsltlPAT 

I\PT«NaOw*I 

AMDA( Ii\»PT}«0. 
1785 CONTINUE 
C 

C TEVPE CONTAINS E I G£N-VkCTORS IN DESCENDING OHQEK uT kOOTS 
C AMDAOOJ HAVE EIGENVALUES IN DESCENDING ORDE.'i 

C T HAS EIGENVALUES IN THE DIAGONAL POSITION 
C 

IFCIS<IP-2) 75l»75Q»751 

750 WRITF(6tl022) 

00 601 L»ltNROW 
IPATsl* NR0W#{L-1> 
IRa NROW#L 

601 WRITE{6»1011) LtAMDAtL) .(TEy-PfcC v.) ,vi=iPAT#lK) 

LsO 

WRITE(6tl0ll } Lt {SDP{ I )#I»1»NR0W) 

C 

C GENERATE vaTRIX ALPmA-STAR .STORE IN T 

751 D01614 INPT"1#5 
00 IPAT«li30 

00 914 IST«lf30 
TdPATilSTUO, 
914 CONTINUE 
ObIQ( INPT} 
Q-Q/10* 

IFdP-K+l} 1892#1691f 1891 

1891 S»<-1 
CO TO 1914 

1892 S«IP 
1914 0«-DL06(0J/S 

Q>DEXP{0;*1« 
OsDSORTIO) 

I»K-1 
DO 604 L«1»I 
AT«CIC-L+1J«U-LJ 
TlL#L)»OSQRT{ AMDACD/AT) #(<-L)*0 

M«<-1 

00 604 JnLfM 
LL«J-^1 

TfLtLL l«-l#DS2RT{ AMDA(L)/AT} 
T(L»LL)»TlLiLL)«0 
  604 T(LL,L)=0.
Sb< 

00 516 L«ltIP 
DO 5i6 J»1#IC 
516 fMC{ INPT#UiWlaT(U#J)»t5S0RT{S) 

GO TO a601tl600)t ISKIP 

1600 wRlTE(6tX615) INPT 
DO 1616 laltNROW 

1616 wRITE{6tlCnU» ( E^ic { I npt • I , j ) , j»i ,k » 

1601 CONTINUE 
C 

C i^ESTORE ALPHA-STAR TO ORIGINAL VAk I ATE-SPACE AUPmA 

DO 630 U«ltNROw 
DO 631 J'iltK 
x=0. 

DO 632 iViaXtNROw 

IPAT*UfNROw» ( M-1 ) 
632 X»TEMP£( IPAT)*T( v,,j}+x 

B£T( JfL)«»X 
631 OVEGA(U,j)»x 
630 CONTINUE 
C 

C GET ALPHA#ALPHA-PRIN1E 
C 

00 633 U»itNROw 
00 634 J»lfNROw 

DO 635 MsltK 

635 XcXfOM£GA(LtM)*BET{V1tJ) 
634 DUM1U)«X 

DO 636 J«l,NROw 

636 T(Lt J)»DUM1{ J) 
633 CONTINUE 

DO 609 UaltNkOw 
DO 609 M«lfNROA 

C 

C 3ET HAS ALPHA«ALPHA-PRlMEtORDtf< IS NkQ^ jy N^Qw 
C T MAS ALPHA. OKDER IS NRQti BY < 

SET( LtMJaT { LtM) 

T{LfV)aOyEGA{ L»V1) 
609 CONTINUE 
C 

C COMPUTE LAMBDA 
C 

DO 612 L>1»NR0W 
DO 612 J«ltNROW 

OMtGAlL»J)«B£T(L»J) *SIGy.A{L#J) 
612 OVECA(LtJ)»OM£GA(LtJ) 

CALL £IGENZ{OMECAtV£CtTVALtDUv,l,.\RU*,30»J5 
S«l* 

DO 23C L=ltNROi* 
230 S»S*AV3A(L } /TvALfL) 



wRITfi{6ti031) INPTtWitKS 
1614 CONTINUE 
NsN-K-f 1 

c 

C RETURN TO XAIN TO GET SAMPLE CORRELATION MATRIX 

RETURN 

C 

C LCO,^ Ff . A SAVPLE SETwEEN GROUP .MATRIX 

771 SA >N 

SA,v,-SAy/< 

C 

C GET The OETLRMINANT OF W AND STORE IN SIDG 



iGOsl 
YaDA8S{SIGMA( 1»1) I 

730 lNDXe2 

723 IFC Y-10*#I\DXI 724»724t725 
725 INDX=INDX-^1 

GO TO 723 

724 INDX=INDX-1 

731 00 727 LaltNROW 



00 727 J»ltNROW 

7 27 SIGMA{ Lf J) «SIGMA(L»J) /10**^NDX 

C 

C COVPUTE EIGENVALUES OF wITmIN GROUPS SSCP MATRIX m 

CALL EIGENZCSIGMAfVECf WVALtDJMltNROwt30»0) 
GO TO {200t201 ) flSiClP 
2"! 00 2C2 L«l»NROw 

IPATal*NROW» (L-1 ) 
IR«NRO«</»L 

2^2 ;^RIT£(6tl011) L#WVAL{L) »(VEC(M> tM-IPATtlR) 

2O0 CONTINUE 
C 

C»»##GENtRATt BETWEEN GROUPS SSCP MATRIX B 

DO 614 INPT»l»5 
DO 701 I»ltIP 
DO 700 J»1.K 
CALL NOt?MAL<XJ 

SsDSQRTC AVDAf I >/SAM) 
STAAsX#S+EMC( I\PT»I »J) 
T n t J) «STAX 



700 CONTINUE 

701 CONTINUE 

GO TO {203.204J tlSKIP 

204 00 205 LaltNROw 

205 wRITfc{6tlOH) LtlTlLtJ) tJ-lfO 
2 03 CONTINUt 

C 



C r,ET VECTO:? OF GRAND MEANS t STORE IN OMEGA 
      DO 703 I=1,IP

DO 705 JalfK 
''O^ OMEGA( ItJI«S/lC 

7"3 CONTINUE 
C 

C VAKE THE BETWEEN GROUP MATRIX 

00 706 laltIP 
GO TO C207t206)tISKIP 
206 wRIT£C6fl011l I • (0«EGA( I tMI #Msl,<j 

2C7 CONTINUE 
C 

C o£T D£;V!ATI0\S OF GROUP YEANS FROM GRAND MEAN 

00 707 J»lfK 
727 T( I f J)aT{ I f J)-0M£GA{ 1 tj> 

7:^6 C0r>iTlNUk 

Do 709 I«lfIP 

CO TC C208f209) tISKIP 
2C9 wRITe(6tlOUJ 1 1 IT { 1 1 fM«lftO 

208 CONTINUE 

DC 709 J«if< 
709 O'-^EQAC J,I }aT< If J) 

DO 711 lalfXP 

DO 713 Mai, IP 

Ss0« 

DO 712 JBI#< 

712 SaS+TI I f j)«OVEGAUfM) 

OUMl(M)aS*SAV 

713 CONTINUE 

DO 515 MslfNRO* 
515 8ET( ItM)aOUVl(i»iJ 

GO TO l711#211Jf XSKIP 
211 wRlTE(6tl011 ! If CBETUfM) t.vtal,NROW) 

711 CONTINUE 
C 

C SJESTORE e-STAR TO ORIGINAL VAKI ATE-SPACE BfSTORE IN BET 
IGOsl 

7^0 DO 736 L'lfNROW 

DO 737 J=lfNROw 
X»0. 

DO 738 MBl,\aOw 
IPATaU-^NROw^iV-l ) 

738 XsTEMPEf IPAT)#8ET{ v),j}+x 

737 O^'tGAUfJJaX 

(jO TO { 736f213J flSl^IP 
213 aRITE{6»1011) L»{0MEGAa»M)»M«l9.^ROwJ 

736 CONTINUE 

KjOsICjO^I 
  742 DO 739 L=1,NROW



DO 739 jeitNROM' ^^OS^ 
GO TO 740 ^ffMm^ 



739 SfcK J#UJaOM£&ACU»J} 



741 DC 720 L»1,NR0a 
DO 72C JaXffN.VO^ 
8ET(tf JJ»0VFGA(t.#J)/10**Z\0x 

0V!EGA«L»J)»8fcTCu»wJ*Sl;iVA{LtJ> 
720 CONTINUE 
C 

C CALCULATt WlL<S UAMdOA 
C H£T a BETAfcEN GKOJP .v.Ar^lX 
C OVEGA HAS TOTAL 

C GET The determinant of w>3 AND STu-iE IT I\ x 

CALL EISEN^lOyEGAtVECf TVAl.tOUvi,:,^Uft,3J.O) 
Sal* 

GO TC (^I'^taiSJ tIS<IP 

215 L«0 

W^ITE(6#10U) U#( rVALJL) fLaitlH) 
214 DO 216 J-altlP 

216 SsS*TVAt(LI/AVAuaJ 
E'^lLKal./S 

SaS*( 1. 

Sal,-Y/S 

OVESQaS 

IF(IS<IP-2) 7b4#7&5f7b<» 

755 DO 756 Lal.iSOw 

756 W^ITt{6.1Qll) U#{5IoVA{LtJ)»j«lt.,R0^J 
DO 757 Lalf.f^OA 

^57 ARITEI6tlCll) t.# { jyE6A( L«J) fjalt.^WOw) 

00 75 S Laltfr-<0'A 
758 w-^ITECCtUll) LiU^tTJLf J) tW^ltN^^A) 

«V5RITH(6tl032) I SOX I \KO«v'tK;t"^tINPT 

WSITfc{6f99} t^lL< 
754 C0\Tr,Ut: 

PMEA.H I \PT)aP .'iA .{ I\PT5^i.-Eftls.< 

PRT( I^PT)ai.-£AlL< 

C 

C 0-1 EG A SOUAr^t 
C 

Ya<-1 
YVaN-< 

S«T« r.PTJaOMES- 
OvEA'sr JuPT)aOvi£A'.{ rcPT )*s^vt5Q 

ESOR( I \PTJ«Ov = sa*l'j{ INt^T)/lj, -l . 
EVEA>i{ :\PT) »tH'<f<{ l\-»T)*c -'tAM l\PT ) 
SaOMLS'.j+0.0C^5 
ISalO»*;J*S 
Ll ^aNI\T*5 

IF{IS-Liv) 74Ht 749t7i»9 
  749 IS=NINT

GOT^ 3750 

3750 <FfiQ{ IStl.NPT) a<F,^jnS»INPn-M 
CONTINUE 

WRIT£(6f 1031) ICOU\T»PRTfSRT»ERj^ 
IFl ICOU\T-\MTXS) 7H4»745f745 
00 l^d Jslt5 

Py£A\( J )aPvtA.\t J)/..VTXS 

746 E VEA(\! ( J)=tVLAj\(j} / isv.T AS 

I«11-2#J 

WRITE {6»10i4) {J,J«l,3JMJ»J«1.5),CJ,j=i,5) 
wRITh(6»1031 i \.«irAS»P,yfeA.>4,0.vitA.stt'LA'. 
00 250 I=l,5 
250 lUdJsiO-IUU) 
i^RIT£{6tl033) IQ 
747 J=1»M\T 
7^7 wRlTr{6tl032) Jt (<FR2{ j.viJ ,.vi«i ,3 ) 

744 N»N-<+l 

RETURN 

99 F0RMATI2X»F15.4) 

1011 PORMATn3,10Fi3.6/{3X,lOFl3.6) ) 

lolf InoT.V.l ^'^^^^ ^"^^ ^I^^'^ VICTORS'/) 

1031 FORN^ATI 2X» I4»15Fa.4) 

1032 FORMAT ( 6110) 

l^rnoV/^^''*' ^liiTRISgTION OF O.viEOh S-JAXf/lOA, 
1034 FORMATC/ 19X, '.VtAN OF l-L.AMaOA'-,21 A , 

2 !l4I8)'^^ ^'"^^^^ SQUARE' •23X, '.vigAN UF ERRORS' //6X, 16 
'^^^ ENO ^^^^'^^'^^''^ AUPHA-STAR MATRIX FC'^ HCP. I L<S- ' . I 5 ) 

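      SUBROUTINE NORMAL(/Y/)
C     RETURNS ONE STANDARD NORMAL DEVIATE PER CALL. TWO UNIFORM
C     NUMBERS YIELD A PAIR OF DEVIATES, DELIVERED ON ALTERNATE CALLS.
      INTEGER K/1/
      REAL*8 WA,WB
      IF(K.EQ.2) GO TO 3
      WA=RAN32(0)
      WB=RAN32(0)
      WA=DSQRT(-2.*DLOG(WA))
      WB=WB*6.28318531
      Y=WA*DCOS(WB)
      K=2
      RETURN
    3 Y=WA*DSIN(WB)
      K=1
      RETURN
      END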
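
Subroutine NORMAL is recognizable as the Box-Muller transform: two uniform numbers produce a pair of independent standard normal deviates, returned on alternate calls. The sketch below restates that logic in Python for readers unfamiliar with the Fortran. The class name is an illustrative assumption, and Python's `random.Random` stands in for the uniform generator of the original program, which is not reproduced here.

```python
import math
import random


class BoxMullerNormal:
    """Standard normal deviates by the Box-Muller transform."""

    def __init__(self, seed=None):
        self._rng = random.Random(seed)
        self._spare = None  # sine-branch deviate held for the next call

    def draw(self):
        """Return one deviate; every second call reuses the stored pair."""
        if self._spare is not None:
            value, self._spare = self._spare, None
            return value
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        u1 = 1.0 - self._rng.random()
        u2 = self._rng.random()
        radius = math.sqrt(-2.0 * math.log(u1))
        angle = 2.0 * math.pi * u2
        self._spare = radius * math.sin(angle)
        return radius * math.cos(angle)


if __name__ == "__main__":
    gen = BoxMullerNormal(seed=12345)
    sample = [gen.draw() for _ in range(10000)]
    mean = sum(sample) / len(sample)
    var = sum((x - mean) ** 2 for x in sample) / (len(sample) - 1)
    print(f"mean = {mean:.3f}, variance = {var:.3f}")
```

Holding the second deviate of each pair halves the number of logarithm and square-root evaluations, which mattered when many thousands of deviates were needed per sampling distribution.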



[Sample output: the printed listing is not legible in this copy. As far
as can be made out, it gave the values of 1 - LAMBDA, OMEGA SQUARE, and
the errors for the generated samples, followed by the frequency
distribution of OMEGA SQUARE over 200 class intervals, with one column
for each of the five population values.]



Notes: (a) The heading in each column is $10\omega^2$.

(b) Each class interval extends from $5(m-1) \times 10^{-3}$ to $5m \times 10^{-3} - 10^{-4}$,
where m is the class-interval number shown in the first column.
Thus, for example, the last non-empty interval for $\omega^2 = 0.9$,
with m = 194, is [.9650, .9699+].
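
Note (b) can be turned into a small lookup in both directions. The sketch below is illustrative only: the function names are hypothetical, and the clamping of out-of-range values to the first or last interval is an assumption, since the corresponding part of the program listing is not legible.

```python
def class_interval(m):
    """Bounds of class interval m (numbered from 1), following note (b)."""
    lower = 5 * (m - 1) * 1e-3
    upper = 5 * m * 1e-3 - 1e-4
    return round(lower, 4), round(upper, 4)


def interval_number(omega_sq_hat, n_intervals=200):
    """Class-interval number for a value of omega-hat squared.

    Values below 0 or at and above 1 are clamped to the end intervals
    (an assumption, not taken from the original program).
    """
    m = int(omega_sq_hat * 1000 // 5) + 1
    return max(1, min(n_intervals, m))


if __name__ == "__main__":
    print(class_interval(194))
    print(interval_number(0.9672))
```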
The Population Correlation Matrices

For the 10-variate cases, the correlation matrix common to the
sets of five populations was as shown below, rounded to four decimal
places.

For the five- and three-variate cases, the upper left-hand 5 x 5
and 3 x 3 segments, respectively, of this matrix were used.

In each case, the correlation matrix was pre- and post-multiplied
by an arbitrary diagonal matrix to generate the common covariance
matrix.

(A "?" marks a digit that is illegible in this copy.)

         1       2       3       4       5       6       7       8       9      10
 1    1.0000
 2     .1875  1.0000
 3     .0833   .2000  1.0000
 4     .2500   .2500   .1667  1.0000
 5     .1875   .3125   .5000   .3000  1.0000
 6     .2917   .0833   .3333   .1667   .4167  1.0000
 7     .4250   .5000   .3333   .6???   .6125   .3333  1.0000
 8     .2250   .1250   .2917   .2250   .4000   .6000   .2000  1.0000
 9     .3750   .2250   .2500   .2000   .????   .?000   .3000   .1250  1.0000
10     .1000   .4000   .2667   .??00   .5???   .2??7   .0900   .2000   .1???  1.0000

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'*Ortliodox" Saa^Ui^ Proce^sre 
In view of tl» unespeetedly high positive bias In ^ deemed 

edvls^le to Bake sure that tills was not a tesult of tlie aethod, des- 
erlhed In Section 3, that «S8 ised for generating tike ^ts of popular 
tloas so as to have preasalgned values. That aathod involved sUsa* 
taneous dlagooalizatlon of the eosmon covarlacee isatrlx C and the cross 
product Ml* of tl» effeet<-paraseter aatrix. 

^cordlngly, the saapUng distribi^lon of «2 ^^len »2 • o.l was 
c«mstnicted for the case uhen p«iS-58ndN»7S, vith the popolatloas 
generated by an alternative aetiu^ more true to real life. 

A conveniently available data deck containing, among other things,
scores on five standardized achievement tests for some 260 ninth-grade
students was used as one of the five populations; this is referred to
as $P_0$. (Since sampling was to be done with replacement, this modest
population size was deemed sufficient.) Each of the remaining four
populations, $P_1, \ldots, P_4$, was conceptually (but not physically) generated
as follows. Every student's score on any given test was increased
by a small constant amount, the amount varying from test to test (as
well as from population to population). This was to insure that the
five population covariance matrices would be exactly equal.

The additive constants were determined in the following manner:
For the j-th fictitious population (j = 1, 2, 3, 4), the amounts by
which everyone's scores on $X_1, X_2, \ldots, X_5$ were to be incremented were
set at $c\delta_{j1}, c\delta_{j2}, \ldots, c\delta_{j5}$, respectively, where
$(\delta_{j1}, \delta_{j2}, \ldots, \delta_{j5})$
is a separate random permutation of (.1, .2, ..., .5) for each j, and
c is a constant to be determined so that $\omega^2 = 0.1$.

For each $\hat{\omega}^2$ value to be computed, a random sample $S_{jj}$ should be
drawn from each population (j = 0, 1, 2, 3, 4), and the within- and
between-groups matrices W and B computed. In actuality, however, five
samples $S_{00}, S_{01}, \ldots, S_{04}$ may be drawn from $P_0$, and subsequent
computational adjustments made where necessary. For computing W, no
adjustments are needed; for B the required adjustments are as described
below.

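Let the centroids of the five samples $S_{0j}$ be

$$\bar{\mathbf{X}}'_{0j} = [\bar{X}_{0j1}, \bar{X}_{0j2}, \ldots, \bar{X}_{0j5}], \qquad (j = 0, 1, \ldots, 4).$$

Then the centroids that would have been observed if the samples $S_{jj}$
(j = 0, 1, ..., 4) had been drawn are: $\bar{\mathbf{X}}'_{00}$ (observed) for j = 0, and

$$\bar{\mathbf{X}}'_{jj} = \bar{\mathbf{X}}'_{0j} + c[\delta_{j1}, \delta_{j2}, \ldots, \delta_{j5}] \qquad \text{for } j = 1, 2, 3, 4.$$

The vector of grand means for the total sample comprising $S_{00}, S_{11}, \ldots, S_{44}$ is

$$\bar{\mathbf{X}}' = \bar{\mathbf{X}}'_{0} + c[\bar{\delta}_1, \bar{\delta}_2, \ldots, \bar{\delta}_5],$$

where

$$\bar{\delta}_p = \frac{1}{5}\sum_{j=1}^{4} \delta_{jp}.$$

Thus, the deviations of the group centroids from the grand centroid are

$$\bar{\mathbf{X}}'_{jj} - \bar{\mathbf{X}}' = (\bar{\mathbf{X}}'_{0j} - \bar{\mathbf{X}}'_{0}) + c[\delta_{j1} - \bar{\delta}_1, \delta_{j2} - \bar{\delta}_2, \ldots, \delta_{j5} - \bar{\delta}_5],$$

where the $\bar{X}$'s are found from the samples actually drawn (all from $P_0$),
and the adjustment can be computed once the value of c is known.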
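The centroid adjustment described above can be checked numerically. The following is only an illustrative sketch, not the program used in the study: the stand-in population, the seed, the equal per-group sample size of 15 (N = 75 split evenly, an assumption) and the value of c are all hypothetical. It draws five samples from a single population, then forms the between-groups matrix B once from adjusted centroids and once from physically shifted scores.

```python
import random

rng = random.Random(1)      # arbitrary seed (assumption)
P, K, n = 5, 5, 15          # tests, groups, per-group size (equal n is an assumption)
c = 2.0                     # placeholder constant, NOT the value found in the study

# Hypothetical stand-in for P_0: 260 five-score vectors.
pop = [[rng.gauss(50, 10) for _ in range(P)] for _ in range(260)]

# Row 0 (population P_0) receives no increment; rows 1..4 are random permutations.
delta = [[0.0] * P] + [rng.sample([.1, .2, .3, .4, .5], P) for _ in range(K - 1)]

# Five samples S_00, ..., S_04, all drawn with replacement from P_0.
samples = [[rng.choice(pop) for _ in range(n)] for _ in range(K)]

def centroid(rows):
    """Mean vector of a list of score vectors."""
    return [sum(r[p] for r in rows) / len(rows) for p in range(P)]

def between(cents):
    """Between-groups SSCP matrix from K group centroids (equal group sizes n)."""
    grand = [sum(cj[p] for cj in cents) / K for p in range(P)]
    dev = [[cj[p] - grand[p] for p in range(P)] for cj in cents]
    return [[n * sum(d[p] * d[q] for d in dev) for q in range(P)] for p in range(P)]

# Adjustment route: shift only the observed centroids by c * delta_j.
adjusted = [[m + c * d for m, d in zip(centroid(s), delta[j])]
            for j, s in enumerate(samples)]
B_adjusted = between(adjusted)

# Direct route: shift every individual score, then recompute the centroids.
shifted = [[[x + c * d for x, d in zip(row, delta[j])] for row in s]
           for j, s in enumerate(samples)]
B_direct = between([centroid(s) for s in shifted])

max_diff = max(abs(B_adjusted[p][q] - B_direct[p][q])
               for p in range(P) for q in range(P))
print(max_diff)  # differs from zero only by floating-point rounding
```

Because adding a constant to every score in a group leaves the within-group deviations untouched, W needs no such correction, which is why only B is adjusted.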

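Now $c(\delta_{jp} - \bar{\delta}_p)$ is precisely the (j, p) element of the transpose
of the effect-parameter matrix. Thus, the population $\omega^2$ defined in
equation (2.1) is here expressible as

$$\omega^2 = 1 - \frac{|\boldsymbol{\Sigma}|}{|\boldsymbol{\Sigma} + c^2(\boldsymbol{\Delta}\boldsymbol{\Delta}')/5|},$$

where $\boldsymbol{\Sigma}$ is the covariance matrix of $P_0$ (and, equivalently, of $P_1, \ldots, P_4$) and

$$\boldsymbol{\Delta}' = \begin{bmatrix}
-\bar{\delta}_1 & -\bar{\delta}_2 & \cdots & -\bar{\delta}_5 \\
\delta_{11} - \bar{\delta}_1 & \delta_{12} - \bar{\delta}_2 & \cdots & \delta_{15} - \bar{\delta}_5 \\
\vdots & \vdots & & \vdots \\
\delta_{41} - \bar{\delta}_1 & \delta_{42} - \bar{\delta}_2 & \cdots & \delta_{45} - \bar{\delta}_5
\end{bmatrix}.$$
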

Since $\boldsymbol{\Delta}'$ is determined once we select the four random permutations
$[\delta_{j1}, \delta_{j2}, \ldots, \delta_{j5}]$ of (.1, .2, ..., .5), and $\boldsymbol{\Sigma}$ is given by the original
data, $\omega^2$ is a function solely of c. (It is clearly a monotonically
increasing function of |c|.) By a trial-and-error process, the value
of c making $\omega^2 = 0.1$ to four decimal places was determined.
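The search for c can be sketched as follows. This is a minimal illustration under stated assumptions, not the procedure actually run: bisection stands in for the trial-and-error process, the seed is arbitrary, and because the covariance matrix of the 260-student data deck is not reproduced here, the upper-left 5 x 5 block of the correlation matrix tabled earlier is borrowed as a stand-in for the covariance matrix. The names `omega_sq` and `solve_c` are hypothetical.

```python
import random

def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    size, d = len(a), 1.0
    for i in range(size):
        piv = max(range(i, size), key=lambda r: abs(a[r][i]))
        if abs(a[piv][i]) < 1e-15:
            return 0.0
        if piv != i:
            a[i], a[piv] = a[piv], a[i]
            d = -d
        d *= a[i][i]
        for r in range(i + 1, size):
            f = a[r][i] / a[i][i]
            for k in range(i, size):
                a[r][k] -= f * a[i][k]
    return d

# Stand-in covariance matrix (assumption; see lead-in).
SIGMA = [
    [1.0000, .1875, .0833, .2500, .1875],
    [.1875, 1.0000, .2000, .2500, .3125],
    [.0833, .2000, 1.0000, .1667, .5000],
    [.2500, .2500, .1667, 1.0000, .3000],
    [.1875, .3125, .5000, .3000, 1.0000],
]

rng = random.Random(0)  # arbitrary seed (assumption)
# Row 0 is P_0 (no increment); rows 1..4 are random permutations of (.1, ..., .5).
delta = [[0.0] * 5] + [rng.sample([.1, .2, .3, .4, .5], 5) for _ in range(4)]
dbar = [sum(delta[j][p] for j in range(5)) / 5 for p in range(5)]
dev = [[delta[j][p] - dbar[p] for p in range(5)] for j in range(5)]  # rows of Delta'
# Delta Delta' is the p x p cross-product of the deviation rows.
DD = [[sum(dev[j][p] * dev[j][q] for j in range(5)) for q in range(5)]
      for p in range(5)]

def omega_sq(c):
    """Population omega^2 = 1 - |Sigma| / |Sigma + c^2 (Delta Delta') / 5|."""
    shifted = [[SIGMA[p][q] + c * c * DD[p][q] / 5 for q in range(5)]
               for p in range(5)]
    return 1.0 - det(SIGMA) / det(shifted)

def solve_c(target=0.1, tol=1e-6):
    """Bisection on c >= 0, relying on omega_sq being increasing in |c|."""
    lo, hi = 0.0, 1.0
    while omega_sq(hi) < target:
        hi *= 2
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if omega_sq(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

c = solve_c()
print(c, omega_sq(c))
```

The value of c obtained this way depends on the stand-in matrix and the permutations drawn, so it is not comparable with the constant used in the study.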

The sampling distribution of 1,000 values of $\hat{\omega}^2$ computed in the
foregoing manner, grouped in class intervals of size 0.03 each, was as
shown in the row labelled $f_1$ below:

f_1    3  16  23  62  74 120 140 141 131  98  88  44  30  19   8   2   1
f_2    9  14  28  79  82 121 145 123 127 101  82  51  19  11   7   1   0

(The class intervals are, from left to right, .0600-.0899, .0900-.1199,
..., .5400-.5699.) In row $f_2$ above is shown the sampling distribution
of 1,000 $\hat{\omega}^2$ values generated by the sampling procedure of Section 3.

A visual comparison of the two distributions shows that they are
quite similar. A chi-square test for the significance of the difference
between the two distributions (with the last three class intervals
collapsed into one) yielded $\chi^2 = 13.31$, df = 14 ($p \approx .50$).
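The reported statistic can be reproduced from the two frequency rows. The sketch below uses a standard chi-square test of homogeneity for a 2 x 15 table; the function names are illustrative, and the frequencies are those tabled above.

```python
# Frequencies from rows f_1 and f_2 (17 class intervals each).
f1 = [3, 16, 23, 62, 74, 120, 140, 141, 131, 98, 88, 44, 30, 19, 8, 2, 1]
f2 = [9, 14, 28, 79, 82, 121, 145, 123, 127, 101, 82, 51, 19, 11, 7, 1, 0]

def collapse_tail(freqs, k=3):
    """Merge the last k class intervals into one."""
    return freqs[:-k] + [sum(freqs[-k:])]

def chi_square_homogeneity(a, b):
    """Pearson chi-square for a 2 x C table; returns (statistic, df)."""
    na, nb = sum(a), sum(b)
    total = na + nb
    stat = 0.0
    for x, y in zip(a, b):
        col = x + y
        ex, ey = na * col / total, nb * col / total  # expected counts
        stat += (x - ex) ** 2 / ex + (y - ey) ** 2 / ey
    return stat, len(a) - 1  # (2 - 1) * (C - 1) degrees of freedom

a, b = collapse_tail(f1), collapse_tail(f2)
chi2, df = chi_square_homogeneity(a, b)
print(round(chi2, 2), df)  # 13.31 14
```

With 14 degrees of freedom the median of the chi-square distribution is about 13.3, which is consistent with the reported p of roughly .50.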

Thus, it may safely be concluded that the two distributions differ
only by sampling error. The apprehension that the high positive bias
of $\hat{\omega}^2$, especially for $\omega^2 = 0.1$ with large p and small N, might have
resulted from the peculiar manner in which the populations were generated
in this study may therefore be cast aside.



